This is an annotated copy of a snapshot of Wikipedia approval mechanism. My comments are so extensive I thought they'd better go in a user page. I'm new to User subpages, so I hope this is an appropriate use for them.
It needs to be read in conjunction with the current version of that page. There are subsequent edits to that page which I have no intention of bringing here. See also Wikipedia talk:Wikipedia approval mechanism, which contains a lot of comments too, some chronologically before mine, some after.
This page is a work in progress, like anything on a Wiki, and my comments will probably be expanded even more in time. And if the proposal goes forward in any form, they will eventually get refactored into the software spec and other more appropriate places.
Please feel free to add your own comments, preferably signed and at least triple-indented. Then we can assume that unsigned double-indented comments are mine unless otherwise indicated. I went to double because in Larry Sanger's and other proposals below, they use single indents already.
The primary purpose here is to relate this existing discussion to the m:Referees proposal, henceforth just referred to as the proposal. Andrewa
"Wikipedia approval mechanism" means any sort of mechanism whereby Wikipedia articles are individually marked and displayed, somehow, as "approved."
The purpose of an approval mechanism is, essentially, quality assurance. By presenting particular articles as approved, we (Wikipedians) would be representing those articles as reliable sources of information.
Among the basic requirements an approval mechanism would have to fulfill in order to be adequate are:
Some "desirements":
The advantages of an approval mechanism of the sort described are clear and numerous:
Generally, Wikipedia will become comparable to nearly any encyclopedia, once enough articles are approved. It need not be perfect. Just better than Britannica and Encarta and ODP and other CD or Web resources.
I am not sure there are any significant disadvantages of an approval mechanism, but idly, I think there might be one. I think that it's possible that Wikipedia might become more of an "exclusive club" than it is, if people start comparing nascent articles contributed by new contributors to the finished projects. I might not want to contribute two sentences about widgets if I think ten neat paragraphs, with references, is what is expected. Again, I don't know if this is really apt to be a problem.
Another general argument against is that this really doesn't seem necessary. An approval mechanism has been suggested since Day One of Wikipedia and, the evidence that Wikipedia is working just fine aside, will probably continue to be suggested 'til kingdom come.
Below, we can develop some specific proposals for approval mechanisms.
This would be added on to any of the above approval processes. After an article is approved, it would go into the database of approved articles. People would be able to access this from the web. After reading an article, the reader would be able to click on a link to disapprove of the article. After 5 (more, less?) people have disapproved of an article, the article goes through a reapproval process, in which only one expert must approve it, followed by the applicable administrators.
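The feedback loop described above can be sketched in a few lines. This is a minimal illustration, not an implementation: the class and method names are hypothetical, and the threshold of 5 is the value the text itself leaves open ("more, less?").

```python
DISAPPROVAL_THRESHOLD = 5  # the text leaves this value open ("more, less?")

class ApprovedArticle:
    """Sketch of the reader-disapproval loop: enough disapprovals send the
    article back through a lighter reapproval step (one expert, then the
    applicable administrators)."""

    def __init__(self, title):
        self.title = title
        self.status = "approved"
        self.disapprovals = 0

    def disapprove(self):
        """A reader clicked the disapproval link."""
        self.disapprovals += 1
        if self.disapprovals >= DISAPPROVAL_THRESHOLD:
            self.status = "pending_reapproval"
            self.disapprovals = 0  # reset the counter for the next round

    def reapprove(self, expert_ok, admins_ok):
        """Reapproval needs only one expert plus the applicable admins."""
        if self.status == "pending_reapproval" and expert_ok and admins_ok:
            self.status = "approved"
```

The point of the sketch is that reapproval is deliberately cheaper than initial approval: one expert rather than a full panel.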
It might also be possible to use some automated heuristics to identify "good" articles. This could be especially useful if the Wikipedia is being extracted to some static storage (e.g., a CD-ROM or PDA memory stick). Some users might want this view as well. The heuristics may throw away some of the latest "good" changes, as long as they also throw away most of the likely "bad" changes.
Here are a few possible automated heuristics:
These heuristics can be combined with the expert rating systems discussed elsewhere here. An advantage of these automated approaches is that they can be applied immediately.
Other automated heuristics can be developed by developing "trust metrics" for people. Instead of trying to rank every article (or as a supplement to doing so), rank the people. After all, someone who does good work on one article is more likely to do good work on another article. You could use a scheme like Advogato's, where people identify how much they respect (trust) someone else. You then flow down the graph to find out how much each person should be trusted. For more information, see Advogato's trust metric information. Even if the Advogato metric isn't perfect, it does show how a few individuals could list other people they trust, and over time use that to derive global information. The Advogato code is available - it's GPLed.
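To make the trust-metric idea concrete, here is a small sketch of trust flowing down a "who trusts whom" graph. Note this is not Advogato's actual metric, which is based on capacity-constrained network flow; it is a simpler PageRank-style propagation used only to illustrate how a few seed individuals' trust can be turned into global per-person scores.

```python
def propagate_trust(edges, seeds, damping=0.85, iterations=20):
    """Iteratively propagate trust from seed accounts down a directed graph.

    edges: {truster: [trusted, trusted, ...]}
    seeds: set of a-priori trusted accounts (trust originates here)
    Returns {person: trust_score}. Illustrative only; Advogato's real
    metric uses network flow, not this PageRank-style iteration.
    """
    people = set(edges) | {t for ts in edges.values() for t in ts} | set(seeds)
    base = {p: (1.0 if p in seeds else 0.0) for p in people}
    trust = dict(base)
    for _ in range(iterations):
        # each person keeps a damped share of their seed trust...
        nxt = {p: (1 - damping) * base[p] for p in people}
        # ...and passes the rest evenly to the people they trust
        for truster, trusted in edges.items():
            if trusted:
                share = damping * trust[truster] / len(trusted)
                for t in trusted:
                    nxt[t] += share
        trust = nxt
    return trust
```

With `{"alice": ["bob"], "bob": ["carol"]}` and `alice` as the seed, trust decays with distance from the seed, and an account nobody trusts scores zero, which is the behavior the paragraph above is after.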
Another related issue might be automated heuristics that try to identify likely trouble spots (new articles or likely troublesome diffs). A trivial approach might be to have a not-publicly-known list of words that, if they're present in the new article or diffs, suggest that the change is probably a bad one. Examples include swear words, and words that indicate POV (e.g., "Jew" may suggest anti-semitism). The change might be fine, but such a flag would at least alert someone else to especially take a look there.
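The trivial word-list flagger above is only a few lines. The watch list here is a made-up placeholder (the text suggests the real list would not be public), and a hit only flags the change for a human look; it never rejects anything automatically.

```python
import re

# Hypothetical watch list; the real one would be kept non-public.
WATCH_WORDS = {"idiot", "stupid", "conspiracy"}

def flag_for_review(diff_text):
    """Return the watch words found in a new article or diff, if any.

    A non-empty result means "someone should especially take a look here";
    the change itself might still be perfectly fine.
    """
    words = set(re.findall(r"[a-z']+", diff_text.lower()))
    return sorted(words & WATCH_WORDS)
```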
A more sophisticated approach to automatically identify trouble spots might be to use learning techniques to identify what's probably garbage, using typical text filtering and anti-spam techniques such as naive Bayesian filtering (see Paul Graham's "A Plan for Spam"). To do this, the Wikipedia would need to store deleted articles and have a way to mark changes that were removed for cause (e.g., were egregiously POV) - presumably this would be a sysop privilege. Then the Wikipedia could train on "known bad" and "known good" (perhaps assuming that all Wikipedia articles before some date, or meeting some criteria listed above, are "good"). Then it could look for bad changes (either in the future, or simply examining the entire Wikipedia offline).
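A minimal sketch of that Bayesian approach, assuming the "known good" / "known bad" training sets described above already exist. This is plain naive Bayes with add-one smoothing rather than the specific token-selection scheme from "A Plan for Spam", but it illustrates the same train-then-score workflow.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class NaiveBayesEditFilter:
    """Train on edits marked good/bad for cause, then score new diffs.
    Simplified illustration; uses standard naive Bayes with add-one
    smoothing rather than Graham's exact token-selection scheme."""

    def __init__(self):
        self.counts = {"good": Counter(), "bad": Counter()}
        self.totals = {"good": 0, "bad": 0}
        self.docs = {"good": 0, "bad": 0}

    def train(self, text, label):
        toks = tokenize(text)
        self.counts[label].update(toks)
        self.totals[label] += len(toks)
        self.docs[label] += 1

    def prob_bad(self, text):
        """P(bad | text) via log-space naive Bayes with add-one smoothing."""
        vocab = len(set(self.counts["good"]) | set(self.counts["bad"])) or 1
        scores = {}
        for label in ("good", "bad"):
            ll = math.log((self.docs[label] + 1) / (sum(self.docs.values()) + 2))
            for tok in tokenize(text):
                ll += math.log(
                    (self.counts[label][tok] + 1) / (self.totals[label] + vocab)
                )
            scores[label] = ll
        m = max(scores.values())
        exp = {l: math.exp(s - m) for l, s in scores.items()}
        return exp["bad"] / (exp["good"] + exp["bad"])
```

A score above some threshold would queue the diff for human review, in the same spirit as the word-list flag: alert, don't reject.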
These are arguments presented for why an additional approval mechanism is unnecessary for Wikipedia:
This idea has some of the same principles as the automated heuristics suggested above. I agree that an automated method for determining "good" articles for offline readers is absolutely crucial. I have a different idea on how to go about it. I think the principles of easy editing and how Wikipedia works now are what make it great. I think we need to take those principles, along with some search engine ideas, to give a confidence level for documents. Then people extracting the data for offline purposes can decide the confidence level they want and only extract articles that meet it.
I think the exact equation for the final scoring needs to be discussed. I don't think I could come up with a final version by myself, but I'll give an example of what would give good points and bad points.
Final score:
a: First we need a quality/scoring value for editors. Anonymous editors are given a value of 1, and a logged-in user gains 1 point for each article he/she edits, up to a value of 100.
b: 0.25 points for each time a user reads the article.
c: 0.25 points for each day the article has existed in Wikipedia.
d: Each time the article is edited it gets 1+(a/10)*2 points, so an anonymous user adds 1.2 points and a fully qualified user adds 21 points.
e: If an anonymous user makes a large change, the article takes a -20 point deduction. Even though this is harsh, if the change goes untouched for 80 days the article will gain all those points back, and faster if a lot of people have read it.
This is the best I can think of right now; if I come up with a better scoring system I'll make some changes. Anyone should feel free to test-score a couple of articles to see how this algorithm holds up. We could even find a way of turning the score into a percentage, so that people can extract 90%-qualified articles.
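Taking up the invitation to test-score, here is the a–e formula above written out directly. Nothing here is new beyond the text: editor value capped at 100, 0.25 per read and per day, 1+(a/10)*2 per edit, and a -20 deduction per large anonymous change.

```python
def editor_value(edit_count, anonymous):
    """Rule (a): anonymous editors count 1; a logged-in user gains
    1 point per article edited, capped at 100."""
    return 1 if anonymous else min(100, edit_count)

def article_score(reads, age_days, editor_values, large_anon_changes=0):
    """Combine rules (b)-(e). `editor_values` holds the rule-(a) value
    of the editor behind each edit of this article."""
    score = 0.25 * reads               # (b) per read
    score += 0.25 * age_days           # (c) per day of existence
    for a in editor_values:            # (d) 1 + (a/10)*2 per edit
        score += 1 + (a / 10) * 2      #     anon: 1.2, capped user: 21
    score -= 20 * large_anon_changes   # (e) large anonymous change penalty
    return score
```

Note how the 80-day recovery in rule (e) falls out of the formula: 80 days at 0.25 points per day is exactly the 20-point deduction.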
Trolls are not here to approve, and usually reject views of experts who must be certified by someone trolls grumble about. So one would expect them to be disgruntled by definition about such a mechanism. However, paradoxically, almost all trolls think they apply clear and reasonably stringent standards. The problem is that each troll has his own standards, unlike those of others!
That said, there is much to agree on: the mechanism itself must be genuinely easy to use, nothing slow and rigorous is of any value, the progress of Wikipedia and its proven process should not be impeded, and the results of the approval can be ignored. Where trolls would disagree is that verifying the expert's credentials is of any value. Any such mechanism can be exploited, as trolls know full well, often being experts at forging new identities and the deliberate disruption of any credentialing mechanism.
One might ignore this, and the trolls, but it remains that what goes on at Wikipedia is largely a process not of approval but of impulse and then disapproval. As with morality and diplomacy, we move from systems of informal to formal disapproval. Today, even our reality game shows demonstrate the broad utility of this approach, with disapproval voting of uninteresting or unwanted or undesired candidates a well-understood paradigm.
So, imagine an entirely different way to achieve the "desirements", one that is a natural extension of Wikipedia's present process of attempt (stubs, slanted first passes, public domain documents, broad rewrites of external texts) and disapproval (reverts, neutralizing, link adds, rewrites, NPOV dispute and deletions). Rather than something new (trolls hate what is new) and unproven that will simply repeat all the mistakes of academia. Imagine a mechanism that
By embracing and extending and formalizing the disapproval, boredom and disdain that all naturally feel as a part of misanthropy, we can arrive at a pure and effective knowledge resource. One that rarely tells us what we don't care about. And, potentially, one that can let us avoid those who we find untruthful.
Include articles that have been proposed by at least one person, and vetoed by none.
Where two versions of an article are so approved, pick the later one. Where no versions of an article are so approved, have no article.
That's it.
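The whole selection rule above fits in one short function. This is a sketch under an assumed data shape (per-version tuples of title, timestamp, proposers, vetoes); the names are hypothetical.

```python
def select_for_release(versions):
    """Apply the propose/veto rule: keep, per article, the latest version
    that has at least one proposer and no vetoes. Articles with no such
    version are omitted entirely ("have no article").

    versions: iterable of (title, timestamp, proposers, vetoes)
    Returns {title: timestamp_of_chosen_version}.
    """
    chosen = {}
    for title, ts, proposers, vetoes in versions:
        if proposers and not vetoes:  # proposed by >=1, vetoed by none
            if title not in chosen or ts > chosen[title]:
                chosen[title] = ts    # prefer the later approved version
    return chosen
```

The one-veto-kills-a-version property does the heavy lifting: no panels, no credentials, just the formalized disapproval the preceding section argues for.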
See m:Referees. This proposal is consistent with much of the above.