This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
I would like to ask permission to do some interwiki bot work with my bot, RobotJcb. I run a multi-login bot, doing interwiki links on several wikis. Today I already made a few edits on EN.wikipedia with the robot: [1]. I did not know that permission was needed; I'm sorry. But you may review those 18 edits to see what kind of upkeep I would like to do. Jcbos 15:27, 8 October 2005 (UTC)
The existing policy on the project page is a bit vague about which interwiki bots are considered "known safe", etc., and I have posed some questions above that affect a lot of interwiki bots, including the two most recent requests for permission.
I propose the following slightly revised policy on Interwiki bots. I have posted an RFC pointing here. -- Beland 03:32, 9 October 2005 (UTC)
pywikipedia interwiki bot operators making links from the English Wikipedia:
Non-pywikipedia interwiki bot operators:
Other editors who see semantically inappropriate interwiki links being made by bots are encouraged to fix those links manually. In general, our feeling is that automatic links which are mostly right are better than no links at all. If one or more bots are making a high percentage of erroneous links, editors are encouraged to leave a note on Wikipedia talk:Bots and the talk page of the bot or bot operator. General feedback on interwiki bot performance is also welcome.
Kakashi Bot will be used for two purposes: one-time requests, and marking short articles (less than 1K) as stubs. Anything less than 42 bytes will be marked with {{db}}, and anything less than 14 bytes will be auto-deleted under my account. This is per the discussion held at Wikipedia:Village pump (proposals)#Auto deletion of nonsense. -- AllyUnion (talk) 04:10, 9 October 2005 (UTC)
Repast, The Tempest (Insane Clown Posse), Formal amendment, Hay sweep, Actual effects of invading Iraq, W. Ralph Basham, Adam bishop, Harrowlfptwé, Brancacci Chapel, Acacia ant (Pseudomyrmex ferruginea), Cyberpedia, Principles of Mathematics, Moss-troopers, Gunung Lambak
All of the articles above were created by people as a "test". On average there are 2-20 of these posts, which keep admins busy unnecessarily. All these pages have one thing in common: they are less than 16 bytes. 15 bytes is a magical number because it is the smallest article size possible to contain a redirect: #REDIRECT [[A]] is 15 characters. -- Cool Cat Talk 02:30, 21 September 2005 (UTC)
We already have a way to detect such pages: my bot in #en.wikipedia.vandalism. It is trivial to add a function to it that deletes any newly created page of less than 15 bytes. I intend to do so; objections? -- Cool Cat Talk 02:30, 21 September 2005 (UTC)
It might be a good idea to avoid deleting anything that is a template. One example is {{deletedpage}} (which happens to be 15 chars, but there is no reason it couldn't have been shorter). Or, it might theoretically be possible to have a very short template that performs some logic based on the {{PAGENAME}}. -- Curps 09:01, 21 September 2005 (UTC)
Pages created in the past 10 minutes: Jamie Lidell, Katy Lennon, Cassa Rosso
Exceptions the bot will not delete:
Given that, how large (in bytes) should a newly created article be in order to be kept?
Objection to the 15-byte limit - Please note: during debates on the potential deletion of stub template redirects at WP:SFD and WP:TFD, it is often easier - so as not to get the "this may be deleted" message on the real template - to replace the redirect with a simple template message. Thus, if a redirect to {{a}} were being debated, the nominated template would contain the text {{tfd}}{{a}} - 12 characters. Then there are pages containing simply {{copyvio}} - 11 characters. I can't think of any smaller possibilities, but they may exist - and 15 is thus too big! Grutness... wha? 01:49, 22 September 2005 (UTC)
Will this bot automatically post an explanation to creators' talk pages explaining why their page was deleted? Perhaps it should, if that's feasible. -- Aquillion 16:09, 26 September 2005 (UTC)
I have been asked by Cool Cat for assistance, but since Cool Cat doesn't have admin powers, I have taken it upon myself to write this bot. I have decided on three levels for this bot: pages of 15 bytes that are not redirects or templates will be deleted automatically by the bot; pages of 42 bytes are automatically marked with {{db}} with the delete reason "Bot detected new article less than 42 bytes, possibly spam."; and anything under 1K will be marked with the generic stub template. -- AllyUnion (talk) 23:57, 8 October 2005 (UTC)
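A minimal sketch of how a three-tier size check along these lines might look; the thresholds and the exception for redirects and templates follow the discussion above, but the function itself is illustrative, not the actual Kakashi Bot code:

```python
# Illustrative sketch only -- not the actual Kakashi Bot code.
# Classifies a newly created page by the size of its raw wikitext.

def classify_new_page(text):
    """Return 'delete', 'speedy', 'stub', or 'keep' for a new page's wikitext."""
    stripped = text.strip()

    # Exceptions discussed above: never touch redirects or bare template calls.
    if stripped.lower().startswith('#redirect'):
        return 'keep'
    if stripped.startswith('{{') and stripped.endswith('}}'):
        return 'keep'

    size = len(stripped.encode('utf-8'))
    if size <= 15:        # tiny test pages: delete outright (needs an admin account)
        return 'delete'
    if size < 42:         # very short: tag with {{db}} for human review
        return 'speedy'
    if size < 1024:       # short but plausibly real: tag as a stub
        return 'stub'
    return 'keep'


if __name__ == '__main__':
    print(classify_new_page('asdf'))                   # delete
    print(classify_new_page('A short test article.'))  # speedy
    print(classify_new_page('#REDIRECT [[A]]'))        # keep
```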
Ok, so I'm not sure if this is the right place for this, but I couldn't find any other talk pages devoted to Wikipedia bots, so here goes: if a bot grabs a page via http://en.wikipedia.org/wiki/Special:Export (both for ease of parsing and to ease the load on the server), is there any way to submit its edits? Looking at the page's source while editing, it seems each submission must be accompanied by an opaque hash value. Is this actually enforced? Or is there something I'm missing here? Thanks in advance. porges 11:03, 13 October 2005 (UTC)
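The "opaque hash" in question is the hidden edit token carried by the standard MediaWiki edit form, so a bot that reads pages through Special:Export still has to fetch the ordinary edit form once before posting a change. A rough sketch of that round trip, assuming the usual edit-form field names (wpEditToken, wpTextbox1, and so on) and using the Python requests library purely for illustration:

```python
# Rough sketch of submitting an edit through the normal MediaWiki edit form.
# The hidden fields fetched here include the "opaque hash" (wpEditToken).
# Field names are the standard edit-form ones, but the exact set required
# can differ between MediaWiki versions -- treat the details as an assumption.
import re
import requests

WIKI = 'http://en.wikipedia.org/w/index.php'

def _hidden(form_html, name):
    """Naively pull a hidden field's value out of the edit-form HTML."""
    escaped = re.escape(name)
    m = (re.search(r'name=["\']%s["\'][^>]*value=["\']([^"\']*)' % escaped, form_html)
         or re.search(r'value=["\']([^"\']*)["\'][^>]*name=["\']%s' % escaped, form_html))
    return m.group(1) if m else ''

def submit_edit(session, title, new_text, summary):
    # 1. Fetch the ordinary edit form to pick up its hidden fields.
    form = session.get(WIKI, params={'title': title, 'action': 'edit'}).text
    data = {
        'wpTextbox1': new_text,
        'wpSummary': summary,
        'wpEditToken': _hidden(form, 'wpEditToken'),
        'wpEdittime': _hidden(form, 'wpEdittime'),   # used for edit-conflict detection
        'wpSave': 'Save page',
    }
    # 2. Post it back the same way the browser's "Save page" button would.
    return session.post(WIKI, params={'title': title, 'action': 'submit'}, data=data)

if __name__ == '__main__':
    s = requests.Session()   # a logged-in session would carry the login cookies
    # submit_edit(s, 'Wikipedia:Sandbox', 'test', 'bot test edit')
```

The pywikipedia framework handles this kind of form bookkeeping internally, which is one reason many bot operators simply build on it rather than posting to the form directly.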
I would like NotificationBot to be allowed to run to notify uploaders about their images in Category:Images with unknown source and Category:Images with unknown copyright status. I would also like to have the bot use two (not yet created) templates: {{No source notified}} & {{no license notified}}. The only difference in these two templates is that the following text will be added: The user has been automatically notified by a bot. Also, two new categories will be created: Category:Images with unknown source - notified & Category:Images with unknown copyright status - notified, with the corresponding templates. These templates will replace the ones on the image, and the bot will sign a date on the page to indicate when it first notified the person. I would also like the bot to give a second notification on the 5th or 6th day, before the end of the 7-day period, and then a final notification on the 7th day. On the 8th day, it will change the image page and add a {{db}} with the reason: "User already warned automatically 3 times about changing copyright information on image. Final notice was given yesterday, at the end of the 7 day period mark." This bot will run daily at midnight UTC.
Oh, and the notification text will be something that looks like this:
First notice:
[[{{{1}}}|75px|center|]] The image you uploaded, [[:{{{1}}}]], has no {{{2}}} information. Please correct the licensing information. Unless the copyright status is provided, the image will be marked for deletion seven days after this notice.
Second notice:
[[{{{1}}}|75px|center|]] The image you uploaded, [[:{{{1}}}]], has no {{{2}}} information. This is the second notice. Unless the copyright status is provided, the image will be marked for deletion two days after this second notice.
Third and final notice:
[[{{{1}}}|75px|center|]] The image you uploaded, [[:{{{1}}}]], has no {{{2}}} information. This is the third and final notice. Unless the copyright status is provided, the image will be marked for deletion tomorrow.
-- AllyUnion (talk) 11:34, 16 October 2005 (UTC)
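A minimal sketch of the day-based schedule described above, with the day boundaries taken from the proposal (first notice on day 0, a reminder on day 5 or 6, a final notice on day 7, {{db}} on day 8); the function name and structure are illustrative, not the actual bot code:

```python
# Illustrative sketch of the day-based notification schedule described above;
# the day boundaries follow the proposal, but the code itself is not the bot's.
import datetime

def action_for(first_notice_date, today):
    """Return which step the bot should take for an image first tagged on first_notice_date."""
    days = (today - first_notice_date).days
    if days <= 0:
        return 'first notice'
    if days in (5, 6):
        return 'second notice'
    if days == 7:
        return 'final notice'
    if days >= 8:
        return 'tag the image with {{db}}'
    return 'wait'

if __name__ == '__main__':
    start = datetime.date(2005, 10, 16)
    for offset in (0, 3, 5, 7, 8):
        today = start + datetime.timedelta(days=offset)
        print(offset, action_for(start, today))
```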
Revised notice [modified further, see source link below]:
The image you uploaded, [[:{{{1}}}]], has no {{{2}}} information. The image page currently doesn't specify who created the image, so the copyright status is unclear. If you did not create the image yourself, then you need to argue that we have the right to use the image on Wikipedia (see copyright tagging below). If you did not create the image yourself, then you should also specify where you found it, i.e. in most cases link to the website where you got it, and the terms of use for content from that page. See Wikipedia:Image copyright tags for the full list of copyright tags that you can use. Unless the copyright status is provided, the image will be marked for deletion on {{{3}}}.
-- AllyUnion (talk) 04:22, 17 October 2005 (UTC)
You could also just skip repeat customers on the first run, on the assumption that it will take a few days to run through all 20,000 or however many images need to be processed, and that a lot of people will probably check their other images after becoming aware that this might happen. I don't know which alternative is worse (annoyance or ignorance) but be prepared for complaints if you post more than two or three messages of the same kind to someone's user page in a relatively short timeframe. In the long run, will the bot check these categories every hour or every day or something? If it's hourly or thereabouts, I wouldn't worry about posting multiple messages. After the first one or two, they should get the idea, and stop uploading without attribution. It would be nice to batch-process, but I'd hate for that to delay implementation of this bot. Images are being deleted all the time, so people need to get notices ASAP. -- Beland 02:17, 20 October 2005 (UTC)
There are some issues I'm trying to resolve. One of them is that the bot is overwriting itself. My initial idea for the project was that the bot would make a first pass over the category of images and move them into a notified category. After the 7-day period in the notified category, it would be presumed that the images could be deleted if they were still in that category. The problem now is that I'd have to build up a list, a category, or a hash table of repeat customers. A database may seem like overkill, but it seems to me to be the most reasonable solution. We are talking about a great many images in that category that really need to be deleted, and all the uploaders need to be notified about them, even if they are no longer active. This covers our butts if someone gets really pissed about an image deletion that they were not notified about. It's more of a, "Yes, we did let you know, and you failed to do anything about it, so it's not our fault." More information on my new project board: User:AllyUnion/Project board -- AllyUnion (talk) 08:55, 20 October 2005 (UTC)
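One lightweight way to handle the "repeat customer" problem without a full database is to keep a small local record of which images have already been notified and when. The sketch below only illustrates that idea; the file name and record layout are hypothetical, not part of the actual bot:

```python
# Illustrative sketch of tracking notified images so the bot does not
# overwrite or repeat its own notices; file name and layout are assumptions.
import datetime
import json
import os

STATE_FILE = 'notified_images.json'   # hypothetical local store

def load_state():
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {}

def save_state(state):
    with open(STATE_FILE, 'w') as f:
        json.dump(state, f, indent=1)

def record_first_notice(state, image, uploader):
    """Remember when an image's uploader was first notified; skip repeats."""
    if image in state:
        return False                   # already notified -- a "repeat customer"
    state[image] = {
        'uploader': uploader,
        'first_notice': datetime.date.today().isoformat(),
    }
    return True

if __name__ == '__main__':
    state = load_state()
    if record_first_notice(state, 'Image:Example.jpg', 'ExampleUser'):
        print('would leave a first notice for Image:Example.jpg')
    save_state(state)
```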
As part of the project Wikipedia:Dead_external_links I would like to fix 301 redirects. The bot will be run as User:KhiviBot.
This will be a manually assisted bot. It will be run using the Perl module WWW-Mediawiki-Client-0.27. I believe cleaning up 301 redirects is a worthwhile goal. Generating the list of URLs is a manual process, since sometimes the redirects might not be valid; hence human intervention is needed to generate the list of URLs. Once the URL list is obtained, the bot can fix them.
An example is the 114 instances of http://www.ex.ac.uk/trol/scol/ccleng.htm .
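The per-URL check behind that list amounts to asking each link whether it answers with a 301 and, if so, where it now points. The sketch below shows that check in Python purely for illustration (the bot itself, as noted, uses the Perl WWW-Mediawiki-Client module), and the result still needs a human to confirm the new target is valid:

```python
# Illustrative check for a permanent (301) redirect on a single external link.
# Written in Python for illustration; the actual bot is Perl-based.
import requests

def permanent_redirect_target(url):
    """Return the new location if `url` answers with a 301, else None."""
    r = requests.head(url, allow_redirects=False, timeout=30)
    if r.status_code == 301:
        return r.headers.get('Location')
    return None

if __name__ == '__main__':
    target = permanent_redirect_target('http://www.ex.ac.uk/trol/scol/ccleng.htm')
    if target:
        print('301 ->', target)   # candidate replacement; still needs a human check
    else:
        print('no permanent redirect')
```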
I notice User:Kakashi Bot is fixing double redirects like A -> B -> C, apparently by just assuming that A should point to C. Looking at Special:DoubleRedirects, this assumption is incorrect maybe 1% of the time, and many of the problems seem to involve loops or bad human edits. Have we decided that any collateral damage here is acceptable, and that in general bots should just make this assumption? That would obviate the need for an entire project. Unless the bot could flag loops - that would be really handy. -- Beland 03:33, 20 October 2005 (UTC)
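Flagging loops is straightforward if the bot walks the whole redirect chain rather than only the first two hops. A sketch of that idea follows; the get_redirect_target helper is hypothetical and stands in for however the bot reads a page and extracts its #REDIRECT target:

```python
# Illustrative sketch of following a redirect chain to its final target while
# flagging loops; `get_redirect_target` is a hypothetical stand-in for reading
# a page and extracting its #REDIRECT target.

def resolve_redirect(title, get_redirect_target, max_hops=10):
    """Return ('ok', final_title), ('loop', titles_seen), or ('too_long', titles_seen)."""
    seen = [title]
    current = title
    for _ in range(max_hops):
        target = get_redirect_target(current)
        if target is None:            # not a redirect: the chain ends here
            return ('ok', current)
        if target in seen:            # A -> B -> ... -> A: flag for a human
            return ('loop', seen + [target])
        seen.append(target)
        current = target
    return ('too_long', seen)

if __name__ == '__main__':
    chain = {'A': 'B', 'B': 'C', 'X': 'Y', 'Y': 'X'}   # toy redirect table
    print(resolve_redirect('A', chain.get))   # ('ok', 'C')
    print(resolve_redirect('X', chain.get))   # ('loop', ['X', 'Y', 'X'])
```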
After pondering what "safe" assumptions would be, I propose the following:
Then we'll see if there are any further complaints or problems. And yeah, querying a database directly would be great, though in the short run, I'm happy to have anything. Oh, and will someone be checking for double redirects that cannot be fixed automatically? It's fine with me if you want to just dump them into some category for random editors to fix. -- Beland 08:03, 26 October 2005 (UTC)