This is the talk page for the Bot Approvals Group Guide. Changes to policy or the role of BAG should be discussed at WP:BOTPOL or WT:BAG. This page is specifically to discuss the best practices related to the day-to-day operations of BAG, and act as a general BAG resource.
I'm going to put this out of chronological order, because it'll make more sense in the archives later on. I'm going to ping all active and semi-active BAG members to get their input on this guide.
This way no one will feel left out, and we'll get more feedback and ideas from people. I'm wondering if we should post notices on WP:BON/WT:BOTPOL too for transparency's sake? Headbomb { talk / contribs / physics / books} 18:26, 12 February 2017 (UTC)
I feel we are splitting hairs a bit trying to differentiate ramp-up style trials from just the extended ones. Ramp-up is just an extended trial done in multiple steps. It's a neat way to phrase it and a common occurrence for complex, impactful tasks, but do we really need the instruction creep? What about trials that last for several weeks if not months, because there are so few pages or edits are deadline-based? Or trials where the bot edits a small subset of pages, but for several days or weeks? Also, an extended trial happens after a regular trial -- it's not a "longer trial", it's an "additional trial". A regular trial can be very long and an extended trial could be super-short to verify some issue that occurred in the first trial (that was only caught because the trial was long enough). How would that fit the guide? I like the idea, but I feel we shouldn't stray too far into formalizing any "kinds" of trials. Perhaps as additional notes and common practice examples.
P.S. It would be interesting to gather some stats on BRFAs -- trial count, conditions, any extensions, participants, etc. — HELLKNOWZ ▎ TALK 12:31, 12 February 2017 (UTC)
Well, initially I thought a ramp up approval would make sense in a "yeah seems fine, but just in case, have a ramp up rollout" way, but Hellknowz et al have made, I believe, a compelling case for having ramp up trials instead. So as far as best practices are concerned, ramp up as part of the trial makes more sense, since this implies the technical review or the consensus gathering isn't quite over. Headbomb { talk / contribs / physics / books} 15:42, 18 February 2017 (UTC)
"If consensus has been demonstrated, or can reasonably be presumed, BAG members have the discretion to allow the proposed bot to undergo trial to judge its technical soundness."
I feel this is too strongly worded for when consensus may not be clear or may rest on WP:SILENCE. A short trial for technical verification, or even to garner further input, should be acceptable before consensus is reached. Leaving a BRFA open after trial to gather input is pretty standard. We've had editors unclear on this before, believing that bots were being "approved" without consensus when they were simply trialed. I think the guide should make it clear that anything less than approval isn't approval and that a trial does not imply eventual approval. BAG should take care not to mislead the botop into coding and running a task that they don't think will have consensus. But sometimes it's inevitable that issues are discovered and wider consensus is requested only once the trial runs. — HELLKNOWZ ▎ TALK 12:44, 12 February 2017 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Bot designers can intend for bots to solicit and consume more human time or less human time. All things being equal, bots should consume less human time, be more discreet, and give priority to human activity over bot activity.
At Wikipedia_talk:Bot_policy#Bots_that_consume_user_time,_and_request_for_comment I am in a conversation which talks this through. The change that I would like is for the bot review process to require bot operators to self-report a likely minimum and likely maximum amount of human time which their bot will consume. I am not advocating for a particular cutoff, but in general, a bot which does a high-value activity and consumes less human time is better than a bot which does a lower-value activity and consumes more human time. For consumption of human labor to be part of the discussion we need a measurement of this, which is challenging, but I think that operator self-reporting during the approval process is a good place to start.
A common response which I hear to this proposal is "It is hard to measure how much human time a bot consumes, therefore by default we should assume that all bots consume zero human time and human time costs should not be a factor in considering the value of a bot." I want to push back against this perspective. I want to avoid any administrative burden on anyone, but as bots do more in wiki, we should establish some community norms on how much human time bots solicit. Blue Rasberry (talk) 15:55, 21 February 2018 (UTC)
And concerning "It is hard to measure how much human time a bot consumes, therefore by default we should assume that all bots consume zero human time and human time costs should not be a factor in considering the value of a bot.", no one has said that. What was said is that how much human time a bot consumes can't be measured, and trying to come up with estimates of that doesn't yield any insight on whether or not a task should be done. It's not that we assume such time is zero (it clearly isn't), it's that having a number for this (e.g. this bot task is estimated to require 100 person-hours out of volunteers) doesn't help make decisions in any way. People will bicker about whether something is 50 person-hours, 100 person-hours, 1000 person-hours, waste time on refining the estimate to get more precise numbers, come up with various scenarios that yield different estimates, ... for what is essentially a completely useless number. Headbomb { t · c · p · b} 14:27, 22 February 2018 (UTC)