This page is for discussion regarding the CollabRC project. To open a new discussion thread, please create a new section at the bottom of this page. All feedback is welcome and will be responded to as time permits.
The "Learning reversion patterns" section used to have a segment written as follows:
To accommodate learning, every time a user reviews a change, one of three results must be chosen (though they may be implicit): vandalism, not obvious vandalism, and certainly not vandalism. These results are fed back to the collaboration bot; the vandalism and certainly not vandalism results are taken into account for adapting the working knowledge of the AI of the bot. A not obvious vandalism result informs the bot that it may have been correct in its analysis, but the change should not be rolled back because it is not obvious vandalism. In such a situation, the AI system will not adapt because inconclusive evidence is given as to whether the result was correct or not.
After discussing with The Thing That Should Not Be, it was determined that this was over-aggressive and might result in a slowdown of recent change patrollers. Furthermore, the AI bot does not necessarily need to know the difference between not obvious vandalism and certainly not vandalism. Accordingly, this section will be changed to be more similar to Huggle's tactics -- either a given change is vandalism or it is not. Since the bot should only be triggering on high-priority threats, anything that would have fit under not obvious vandalism should not be a trigger, and thus must be fed back to the bot anyway. Any feedback regarding this change is appreciated. -- Shirik ( talk) 06:32, 16 December 2009 (UTC)
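For illustration only, the two-outcome feedback described above might be sketched as follows in Python. The names `Verdict` and `feedback_label` are hypothetical and are not taken from CollabRC; the sketch assumes the bot records whether it flagged the change and that each review yields exactly one of two verdicts.

```python
from enum import Enum


class Verdict(Enum):
    """The two reviewer outcomes under the simplified scheme (hypothetical names)."""
    VANDALISM = "vandalism"
    NOT_VANDALISM = "not vandalism"


def feedback_label(bot_flagged: bool, verdict: Verdict) -> str:
    """Describe the reviewer's verdict relative to the bot's own call.

    Every verdict produces a label, so every review can be fed back to the bot;
    there is no third, inconclusive outcome that is skipped.
    """
    is_vandalism = verdict is Verdict.VANDALISM
    if bot_flagged and is_vandalism:
        return "true positive"
    if bot_flagged and not is_vandalism:
        return "false positive"
    if not bot_flagged and is_vandalism:
        return "false negative"
    return "true negative"


# A change the bot flagged but a reviewer chose not to revert
print(feedback_label(True, Verdict.NOT_VANDALISM))
```

Under this assumption, a change that would formerly have been marked "not obvious vandalism" simply counts as not vandalism, so a bot trigger on it is treated as a false positive.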
Tim1357 was kind enough to point me to some sets of known vandalism and non-vandalism which should be used to initially train the bot. These datasets are courtesy of Crispy1989 and were originally intended for ClueBot.
Thanks to those who have worked to build these lists. The information will be truly helpful. Shirik ( talk) 04:00, 17 December 2009 (UTC)
Is the bit about blacklisting editors who have been reverted using Huggle or Twinkle meant literally? I don't, and won't, use either. How about "reverted by a sysop or rollbacker"? Philip Trueman ( talk) 18:51, 12 April 2010 (UTC)
Pages linked to from the main page, highly-used templates, and bios of famous people are important; pages about rivers and little-known books are less important. New pages sometimes get created for pure vandalism, or are really messed up when they first get made. If there were some way to prioritize the importance of a page to the project, wouldn't it be possible to keep a closer watch on important pages? Keep the current system in place, and have the same settings for "absolutely vandalism", but make it easier for new/important pages to trip the "might be vandalism" part. After a week or so, take new pages off the watchlist. Math321 ( talk) 21:00, 23 February 2012 (UTC)
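One possible reading of this suggestion, as a rough Python sketch: the "absolutely vandalism" threshold stays fixed, while the "might be vandalism" threshold is lowered for important pages and for pages less than a week old. All names and numeric values here (`classify`, `maybe_threshold`, the score thresholds, the 0-to-1 score scale) are assumptions made for illustration, not CollabRC settings.

```python
from datetime import datetime, timedelta

# Assumed values on a 0..1 vandalism score; purely illustrative.
DEFINITE_THRESHOLD = 0.95          # "absolutely vandalism", same for every page
BASE_MAYBE_THRESHOLD = 0.60        # "might be vandalism" for ordinary pages
SENSITIVE_MAYBE_THRESHOLD = 0.45   # lower bar for new or important pages
NEW_PAGE_WINDOW = timedelta(days=7)  # "after a week or so" new pages lose the bonus


def maybe_threshold(created: datetime, important: bool, now: datetime) -> float:
    """Return the 'might be vandalism' threshold that applies to a page."""
    is_new = now - created <= NEW_PAGE_WINDOW
    if important or is_new:
        return SENSITIVE_MAYBE_THRESHOLD
    return BASE_MAYBE_THRESHOLD


def classify(score: float, created: datetime, important: bool, now: datetime) -> str:
    """Classify a change from its score and the page's age and importance."""
    if score >= DEFINITE_THRESHOLD:
        return "absolutely vandalism"
    if score >= maybe_threshold(created, important, now):
        return "might be vandalism"
    return "no trigger"


now = datetime(2012, 2, 23)
# The same score trips the softer category only on the three-day-old page.
print(classify(0.5, datetime(2012, 2, 20), False, now))
print(classify(0.5, datetime(2012, 1, 1), False, now))
```

How importance would actually be determined (main page links, template usage, biography status) is left open here and represented by a plain boolean.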