Operator: Jtmorgan ( talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 01:02, Saturday July 7, 2012 ( UTC)
Automatic, Supervised, or Manual: Automatic
Programming language(s): Python, uses WikiTools
Source code available: https://github.com/jtmorgan/hostbot/tree/master/new_editor_invites
Function overview: A proposed extension of HostBot's duties to include inviting selected new editors to participate in WP:Teahouse by posting an invite template on their talk pages.
Links to relevant discussions (where appropriate): Wikipedia_talk:Teahouse/Host_lounge/Archive_5#New_bot_proposal:_automated_invites._Input_requested.21
Edit period(s): Daily
Estimated number of pages affected: 70 - 100 pages per day
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): Yes
Function details: Teahouse relies on direct outreach to new editors for a great deal of its traffic, but manually inviting people via talk page templates is time-consuming. Because many invites need to be sent each day, the invites go only to a pre-filtered subset of new editors, and the task is at present fairly tedious and involves little personal interaction, we feel it is a good candidate for an experiment with automation.
HostBot currently publishes a daily invitee report which consists of a list of ~50-100 new editors who meet a set of baseline criteria for invitation. These criteria are intended to screen out both account creators who don’t intend to edit seriously, and blatant vandals.
Currently, Teahouse hosts manually invite this set of editors using a standardized talk page invite template. Because of hosts’ other time commitments, on many days no invitations are sent out at all, so many of the promising new editors that the Teahouse could serve don’t hear about it. We would like to try automating the invite process to see whether it increases good-faith traffic to Teahouse and allows volunteers to spend time on more personal tasks, such as answering questions and welcoming new editors who do visit Teahouse.
We’re aware that automatic welcoming is one of the perennially rejected bot proposals, and that some might think automated invitations are similar. The reasons generally given for rejecting welcome bot proposals are:
These are valid concerns for welcoming on Wikipedia, as welcome templates are delivered to a majority of new editors each day, but they don’t apply to the situation of Teahouse invites.
We invite your feedback on how to make the automated invite process compliant with existing bot policies and best practices; we want to do this right. If approved, we intend to implement automatic Teahouse invitations via HostBot on a trial basis, for a period of 2 weeks. Bot behavior will be monitored daily by Teahouse hosts during this period, who will perform spot checks to make sure the bot is performing as designed. After the trial, we will assess whether this invitation process has led to an increase in vandalism, and/or whether the bot has had any other unforeseen impact.
{{ BAG assistance needed}} - No action taken on the request by the BAG for almost a week now. -- Nathan2055 talk - contribs 17:30, 10 July 2012 (UTC)
Approved for trial (14 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. - The Earwig (talk) 15:09, 13 July 2012 (UTC)
Has the source code been published? It looks like it has some quirks in it. -- MZMcBride ( talk) 15:55, 15 July 2012 (UTC)
When I first joined Wikipedia, I was welcomed by a person. I still remember that. I'm not sold on the idea that having a welcome bot, even if limited by specific parameters (edit count, block status, etc.), is a good idea. Hasn't a good amount of the relevant research (by both your team and other groups) indicated that new users, in particular, value human interaction, not botspam? -- MZMcBride ( talk) 15:55, 15 July 2012 (UTC)
It would be helpful if you could clearly define what problems you're attempting to address with this bot. Fundamentally, using a welcome bot is a bad idea. I say so, the community has said so on countless occasions, and nothing in this request makes it clear that anything has changed that requires a re-evaluation of this position. You're certainly not the first bot operator to suggest adding filters/constraints to the input list. This is a perennially denied request for a reason.
If you can begin to define the problems you're trying to address, it might be helpful in developing actual solutions to those problems. What you've currently created is a wall of text to obfuscate the fact that you're simply re-proposing what has been previously (and rightly) rejected countless times. -- MZMcBride ( talk) 16:05, 15 July 2012 (UTC)
A few brief points (wouldn't want to be a textwaller!): first, the trial will need to be postponed a week anyway as I crawl my way out of Wikimania backlog/coma and get the new code written, tested and made available in an online repository, which I will then provide a link to (no arguments with you there, MZ). Second, I think that many people involved in this discussion and elsewhere will agree with me that the Teahouse itself is an attempt to address the "underlying issue" of new editors not getting sufficiently personalized welcomes. The invite template is simply the most expedient mechanism for directing new editors to a place where they can feel welcome and be welcomed by a person. If we can save hours of volunteer time by automating that process, we create more opportunities for friendly interaction on the Teahouse, which is a lot more satisfying for both guests and hosts. - J-Mo Talk to Me Email Me 20:01, 16 July 2012 (UTC)
I'm concerned that the number of posts I've made here (and the posts' lengths) make it seem as though I care about this issue more than I actually do. I really don't care very much. I posted here because Jtmorgan asked me to take a look and I did. What I found was a proposal (a message delivery bot for new users) that the community has specifically rejected previously (to the point that people wrote documentation about it) and that I personally don't consider to be an appropriate use for a bot. There are appropriate and inappropriate times to use automation; for me, this is an inappropriate time (much like automatic spell-checking or automatic image de-linking). I think that leaving personalized messages is better and that working toward that goal would be a better use of time and resources. I also don't think there's much community consensus for adding a welcome bot (outside of Teahouse-related folks), but it's not my call. And maybe the community has finally changed its mind and doesn't mind. I'm not a member of the bot approvals group. One of those people (and a bureaucrat, I guess) will decide whether to approve or deny this bot. I'm hoping this is my last post here, but I'm making no promises. :-) -- MZMcBride ( talk) 22:41, 17 July 2012 (UTC)
I've checked the code for automatic invites into a subversion repository with Google Code (link above). Other HostBot scripts will live there as well, including the code that generates the list of new editors who are to be invited (the same script that runs the daily Invitee reports). I intend to run my first small-scale test tonight, on 10 new editors' pages, and monitor the outcome closely, reverting any errors that may occur. If nothing breaks, I'll set up a cron job to run the automatic invite script daily for the duration of the 2-week trial, through 8/5/2012. - J-Mo Talk to Me Email Me 23:03, 22 July 2012 (UTC)
Trial complete. You can view relevant contribs here. Will link to a couple examples of minor (fixed) errors tomorrow. - J-Mo Talk to Me Email Me 03:44, 6 August 2012 (UTC)
Just a heads-up about a possibility for the watchlist: let the bot watchlist all pages that received an invitation (a standard preference at Special:Preferences#mw-prefsection-watchlist --> Add pages and files I edit to my watchlist) and publish the RSS token as described at Wikipedia:WATCHLIST#RSS feed. This is currently the easiest way (without any tools) to follow the talk pages of newly contributing users. BTW: what does the bot do if somebody has redirected their own talk page to another page (by accident or on purpose, e.g. by moving the talk page to mainspace when the original was used as a sandbox)? mabdul 07:44, 6 August 2012 (UTC)
BAG Comment: As the overturning of a long-standing consensus, this should really be advertised on WP:CENT or somesuch. - Jarry1250 Deliberation needed 12:39, 13 August 2012 (UTC)
While I agree that this is technically not a welcoming bot, it does perform a closely-related function, so I'm not sure how fair it is to consider it as a wholly-separate issue. In light of the fact that the Teahouse emphasizes more personal interaction with new editors, it deserves careful consideration.
On the other hand, it's important to consider factors besides the fact that this is an automated process which can lead to it being perceived as impersonal; in particular, the template message. Currently, we are using what is essentially one template (regular one, plus the AFC variant) to invite users. It is the same template, usually with the same message, every time. As far as having a bot do this goes, it's really no worse than what we already have, except instead of a lot of people plastering the same message everywhere, it'll be a few people and a program plastering the same message everywhere. As I mentioned over at Meta (in the discussion of various Teahouse pages by all the hosts), we could really benefit from having a large variety of templates, message wordings, etc. to choose from. Hell, perhaps I'll start making some to use myself, and we'll see if they catch on. If the bot could cycle through various messages / template styles or randomly select one for each user, I would certainly support that. dalahäst ( let's talk!) 23:32, 13 August 2012 (UTC)
There is now one day left until the month of extended trial is up. The bot has been running and appears to be fine, but tomorrow the operator will call in and tell us any findings/bugs. After that, we seem to have general consensus for "yes, start writing invitations." Rcsprinter (whisper) @ 16:09, 14 September 2012 (UTC)
I've put together a report of findings from the last month of automated invites on the Meta research page. Looks like findings are consistent with those from the first report: automated invites get about the same response rate & block rate as manual invites. Interestingly, there's no significant difference between the performance of generic vs. personalized invites. And the August Teahouse metrics report shows that automated invites have substantially increased participation. There's plenty of room for discussion/interpretation on all of these fronts, obviously. Tho IMO we should probably hold any research-related discussion on the Teahouse Host Lounge so that more hosts can participate.
In response to editor feedback (see here, here and here) I've added some exclusion criteria into the script. The newest version is available at the google code link above. Specifically, HostBot now 'skips' a potential invitee if the following appear on their talkpage:
These changes have just gone into effect in the last few days, but are currently functioning as advertised. For example, this user was passed over for invitation today even though they appeared on the September 17th Teahouse invitee report because they had been served a level 4 vandalism warning. I'll continue to monitor the code for breaks and inappropriate edits, and I will continue to discuss updates, tweaks and new ideas with other community members! Jmorgan (WMF) ( talk) 01:56, 17 September 2012 (UTC)
I apologize for the late "objection"; however, I have a few questions regarding the bot task (I've been following it for a while, but never had any real time to look into it).
Again, I'm sorry for bringing these questions up so late in the approvals process; I really just haven't had time to look into the task as much as I had wanted to. Thanks, Lego Kontribs TalkM 07:10, 2 October 2012 (UTC)
Code review: the code isn't fully exclusion compliant. For example, if I used {{bots|deny=Legobot,HostBot}}, HostBot would still edit the page. It's probably just easier to use the regex listed at Template:Bots#Python. Lego Kontribs TalkM 08:53, 3 October 2012 (UTC)
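For illustration, the exclusion check described above could look roughly like the sketch below. This is not the regex from Template:Bots#Python and not HostBot's code; the function name may_edit and the pattern are assumptions, it is written for Python 3, and it deliberately treats any {{nobots}} as a blanket opt-out.

```python
import re

# Hypothetical pattern: matches {{nobots}} and {{bots|...}} (simplified).
BOTS_TEMPLATE = re.compile(
    r"\{\{\s*(nobots|bots)\s*(?:\|([^{}]*))?\}\}", re.IGNORECASE
)


def may_edit(wikitext, bot_name):
    """Return False if the page opts out of edits by bot_name."""
    bot = bot_name.strip().lower()
    for match in BOTS_TEMPLATE.finditer(wikitext):
        if match.group(1).lower() == "nobots":
            return False  # simplification: any {{nobots}} denies every bot
        for param in (match.group(2) or "").split("|"):
            key, _, value = param.partition("=")
            key = key.strip().lower()
            names = {name.strip().lower() for name in value.split(",")}
            # deny=all or deny=...,HostBot,... excludes this bot
            if key == "deny" and ({"all", bot} & names):
                return False
            # allow=... excludes every bot that is not listed
            if key == "allow" and not ({"all", bot} & names):
                return False
    return True
```

With this sketch, a page containing {{bots|deny=Legobot,HostBot}} is rejected for HostBot because each name in the comma-separated list is compared individually instead of matching the parameter value as a whole.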
A few more things I've found:
The recordSkips() function (also lines 165-167) isn't safe against injections. The proper way to implement it would be like this:
    def recordSkips():
        for skipped in skip_list:
            cursor.execute('''update jmorgan.th_up_invitees set hostbot_skipped = 1 where user_name = ?;
            ''', (skipped,))
        conn.commit()
You really should be using cursor.executemany(), however the above will sanitize your data. (You can never be too careful!)
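As a rough sketch of the executemany() variant mentioned above: the example below uses the standard-library sqlite3 module as a stand-in so that it runs anywhere, which is an assumption (HostBot talks to MySQL). The table is created here without the jmorgan. schema prefix, and the function name record_skips and its arguments are hypothetical. The "?" placeholder style matches sqlite3 and oursql; MySQLdb uses "%s" instead.

```python
import sqlite3


def record_skips(conn, skip_list):
    """Flag every skipped user in one batch of parameterized updates."""
    conn.executemany(
        "UPDATE th_up_invitees SET hostbot_skipped = 1 WHERE user_name = ?",
        [(name,) for name in skip_list],  # one parameter tuple per user
    )
    conn.commit()


# Demo against a throwaway in-memory database (stand-in for the real table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE th_up_invitees "
    "(user_name TEXT PRIMARY KEY, hostbot_skipped INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO th_up_invitees (user_name) VALUES (?)",
    [("Alice",), ("Bob",), ("O'Brien",)],
)

# The last name is a hostile string; it is bound as data, never run as SQL.
record_skips(conn, ["Bob", "O'Brien", "x'; DROP TABLE th_up_invitees; --"])

rows = conn.execute(
    "SELECT user_name, hostbot_skipped FROM th_up_invitees ORDER BY user_name"
).fetchall()
```

Because the user names are passed as bound parameters, a name containing quotes or SQL fragments only fails to match a row; it cannot alter the statement.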
[...] oursql once I've migrated the code to its new server, and sanitize then. Jmorgan (WMF) ( talk) 20:59, 7 October 2012 (UTC)
[...] talkpageCheck(), is there a reason you are using urllib2 to get the ?action=raw as opposed to the builtin wikitools.Page.getWikiText()?
[...] urllib.quote_plus() to do them all. Lego Kontribs TalkM 09:16, 3 October 2012 (UTC)
[...] oursql (available on willow) over MySQLdb you don't have to use the encodeCheck() you have in line 55.
[...] except: statement (L81). Anything from a connection/server error when using urllib2 to a unicode error in the text could trip it, and the error isn't logged anywhere so a human could potentially invite the user, it just gets lumped into the skipped group. Lego Kontribs TalkM 02:59, 4 October 2012 (UTC)
[...] invitecheck.py so that it adds 'skipped' to the invited? column of the database report. This makes it more transparent that HostBot has skipped a user. I'll build in better non-latin char handling post-migration. Sounds like oursql will help me there, too. Jmorgan (WMF) ( talk) 20:59, 7 October 2012 (UTC)
Any updates? mabdul 19:00, 2 November 2012 (UTC)
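On the bare except: point raised in the code review above, one possible shape is to catch only the expected failures, log them, and report them separately from deliberate skips. The sketch below is written for Python 3 and is not HostBot's code: the function check_invitee, the fetch_text callable, the logger name, and the "uw-vandalism4" marker are all illustrative assumptions.

```python
import logging

log = logging.getLogger("hostbot.invites")  # hypothetical logger name


def check_invitee(user_name, fetch_text):
    """Classify a user as 'invite', 'skip', or 'error'.

    fetch_text is any callable that returns the talk page wikitext for
    user_name (for example, a thin wrapper around an API request).
    """
    try:
        text = fetch_text(user_name)
    except (OSError, UnicodeError) as exc:
        # Network failures (urllib's URLError is an OSError) and decoding
        # problems are logged, so a human can follow up on these users.
        log.warning("Could not check talk page for %r: %s", user_name, exc)
        return "error"
    # Placeholder skip rule; the real skip criteria are not reproduced here.
    return "skip" if "uw-vandalism4" in text else "invite"
```

Keeping "error" distinct from "skip" means the database report could show that a user was never actually checked, and any unexpected exception type still surfaces instead of being swallowed.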