This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Yes, the first post on the bot board! I have come across quite a few bot user pages that give little to no description of what the bot does. I'm guessing not all have one because the bot approval process hasn't always existed. I suggest creating a talk message template to put on bot owners' talk pages, asking them to add more information about their bot on the talk page. Relevant information should include the bot owner, all jobs that are carried out, and the language(s) the bot uses. -- Andeh 18:41, 22 October 2006 (UTC)
I have started reconciling All flagged bots with WP:RBOTs, and it seems that we may have numerous bots that are no longer in operation. Any comments on deflagging these accounts? — xaosflux Talk 04:58, 22 October 2006 (UTC)
About 1/2 of all bots are currently active, see User:Voice_of_All/Bots. Voice-of-All 06:31, 23 October 2006 (UTC)
I created a userbox for a bot owner. Use it if you like. -- Ganeshk ( talk) 21:24, 22 October 2006 (UTC)
It produces:
This user runs a bot, Ganeshbot ( contribs). It performs tasks that are extremely tedious to do manually.
Maybe someone watching this page can help me. I'm converting old-style {{PDFlink}} usage to the new style. The old style could take a variety of forms, including:
    [http://LINKGOESHERE] {{pdflink}}
    [http://LINKGOESHERE]({{pdflink}})
    {{pdflink}}[http://LINKGOESHERE]
    ({{pdflink}}) [http://LINKGOESHERE]
The new style is standardized, and looks like this:
    {{pdflink|[http://LINKGOESHERE]}}
I was trying to create a regex (or two) to convert it, and I came up with these:
Template before link:
    Find: (\(|)\{\{(PDFlink|Pdf|Pdflink)\}\}(\)|)( |)(\[http(.*?)\])
    Replace: {{$2|$5}}
Link before template:
    Find: (\[http(.*?)\])( |)(\(|)\{\{(PDFlink|Pdf|Pdflink)\}\}(\)|)
    Replace: {{$5|$1}}
to match the two general cases (link before or after template). The only problem is that if there are two links in one paragraph, it will match the entire thing as one link (for example, see this sandbox edit). Can anyone help? — Mets501 ( talk) 15:18, 29 October 2006 (UTC)
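One possible way around the over-matching described above, offered as a sketch rather than a tested bot rule: in the second pattern the lazy `(.*?)` can still run past a closing `]` when that is the only way to reach a following template, so a paragraph with two links is swallowed as one. Replacing `.*?` with a character class that cannot cross `]` keeps each match inside a single link. The function name `convert` and the sample URLs are hypothetical, and the patterns are written for Python's `re` module, so the replacement uses `\1` where the table above uses `$1`.

```python
import re

# Template names taken from the original patterns (case-sensitive, as posted).
TEMPLATE = r"\{\{(PDFlink|Pdf|Pdflink)\}\}"
# A bracketed external link; [^\]\n]* cannot run past the closing bracket.
LINK = r"(\[http[^\]\n]*\])"

# Case 1: template (optionally in parentheses) before the link.
BEFORE = re.compile(r"\(?" + TEMPLATE + r"\)? ?" + LINK)
# Case 2: link before the template (optionally in parentheses).
AFTER = re.compile(LINK + r" ?\(?" + TEMPLATE + r"\)?")


def convert(text):
    """Rewrite old-style PDFlink usage into the {{PDFlink|[link]}} form."""
    text = BEFORE.sub(r"{{\1|\2}}", text)
    text = AFTER.sub(r"{{\2|\1}}", text)
    return text


two_links = "See [http://a.example/x.pdf] and [http://b.example/y.pdf] {{PDFlink}}."
print(convert(two_links))
```

Text with a template sitting between two links remains ambiguous under either pattern, so such edits would still need a manual check.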
Is there Python code somewhere that checks for a new message on the talk page and stops the bot? A list of these little code snippets would be really useful. Thanks, Ganeshk ( talk) 23:50, 29 October 2006 (UTC)
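A minimal sketch of such a check, under stated assumptions: it relies on the present-day MediaWiki API query `action=query&meta=userinfo&uiprop=hasmsg`, which may not have been available when this was asked, and it takes the already-parsed JSON response as input instead of making the request itself. The helper names `has_new_messages` and `run`, and the sample response, are hypothetical.

```python
def has_new_messages(response):
    """Return True if a parsed userinfo response reports pending talk messages.

    Assumes the response of action=query&meta=userinfo&uiprop=hasmsg, where a
    "messages" key is present (as "" or true) when new messages are waiting.
    """
    userinfo = response.get("query", {}).get("userinfo", {})
    return userinfo.get("messages", False) is not False


def run(tasks, check_messages):
    """Run tasks in order, stopping as soon as check_messages() returns True."""
    done = []
    for task in tasks:
        if check_messages():
            break  # someone left a message: stop before the next edit
        done.append(task())
    return done


# Hypothetical parsed response for a bot account with a pending message.
sample = {"query": {"userinfo": {"id": 1, "name": "ExampleBot", "messages": ""}}}
print(has_new_messages(sample))
```

In a real bot, `check_messages` would fetch and parse the API response before each edit; here it is injected so the stopping logic can be tried without network access.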
I am setting up the India project template as a mini talkpage template. I am trying to get rid of extra spaces on my template, User:Ganeshk/sandbox2. Please check the talk page. I want to get rid of the space between the boxes in the userboxes on the right. Could anyone please check the template and help me fix the problem? -- Ganeshk ( talk) 08:02, 3 November 2006 (UTC)
Does anyone know of an efficient way to get only the redirects to a certain page in Python (in the pywikipedia package)? I'm using
    referredPageTitle = wikipedia.input(u'Links to which page should be processed?')
    referredPage = wikipedia.Page(wikipedia.getSite(), referredPageTitle)
    gen = pagegenerators.ReferringPageGenerator(referredPage)
    gen = pagegenerators.RedirectOnlyPageGenerator(gen)
now, but it's really slow, and there has to be a better way. I created RedirectOnlyPageGenerator as
    class RedirectOnlyPageGenerator:
        """
        Wraps around another generator. Yields only those pages that are
        redirects.
        """
        def __init__(self, generator):
            self.generator = generator

        def __iter__(self):
            for page in self.generator:
                if page.isRedirectPage():
                    yield page
— Mets501 ( talk) 19:14, 12 November 2006 (UTC)
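For illustration, the same filter can be written as a plain generator function and exercised without a wiki. This sketch does not make the lookup any faster; it only shows that the wrapper calls `isRedirectPage()` once for every referring page, so if that call is expensive the cost grows with the number of referring pages. `FakePage` is a hypothetical stand-in for a pywikipedia page object, and whether the real method triggers a page fetch is an assumption not confirmed in this thread.

```python
def redirect_only(pages):
    """Yield only the pages whose isRedirectPage() returns True."""
    for page in pages:
        if page.isRedirectPage():
            yield page


class FakePage:
    """Hypothetical stand-in for a pywikipedia Page; counts redirect checks."""
    lookups = 0

    def __init__(self, title, is_redirect):
        self.title = title
        self._is_redirect = is_redirect

    def isRedirectPage(self):
        FakePage.lookups += 1  # one check per referring page
        return self._is_redirect


pages = [FakePage("A", False), FakePage("B", True), FakePage("C", True)]
redirects = [page.title for page in redirect_only(pages)]
print(redirects, FakePage.lookups)
```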
Not exactly for writing a bot here (more of a program :P), but I thought some people here might know. How do you put pages on the watchlist? I've got wpWatchThis= in the post string, but does "true/false" or "on/off" go after the brackets? I'll try some testing if no one knows. Martinp23 20:30, 18 November 2006 (UTC)
I'm wondering about the possibility of better-standardised edit summaries for bots and scripts; please leave any comments at Wikipedia talk:WikiProject User scripts#Arrows in edit summaries. -- ais523 13:16, 4 December 2006 (UTC)
Hi, I am trying to develop a helpful little bot called PockBot but I've not got massive amounts of experience with Perl or Wikipedia. I am trying to make the bot post a new comment to a given page (after doing a bunch of stuff irrelevant to this problem). However, I have run up against all kinds of problems with edit tokens etc.
I understand that you need to use a GET request to get the form-field page, screen-scrape the edit token off it, and then use this in submitting a second GET request to post the actual comment. However, I've found that when the bot makes the same HTTP GET request that I type in manually, I am presented with an edit token but the bot is not. As a result, when the bot tries to write its data to the page, the edit is ignored: viewing the page afterwards shows it unchanged.
Does anybody have any simple Perl code chunk for performing these actions? I presume that everyone must come up against this hurdle. I have tried looking at Pearle bot but couldn't get code chunks from that to work either.
I'm not a complete novice Perl coder, but I'm not a coding expert either. Any help appreciated!
PocklingtonDan 17:48, 5 December 2006 (UTC)
Before I switched to pywikipedia, when I was still using Perl, I found a very neat module: HTML::Form. It can parse the HTML text and extract forms:
    use HTML::Form;
    $form = HTML::Form->parse($html, $base_uri);
    $form->value(query => "Perl");
You can then modify the form any way you want (read the documentation to see how) and finally, its "click" method provides a ready-to-use-by-a-user-agent HTTP request:
    use LWP::UserAgent;
    $ua = LWP::UserAgent->new;
    $response = $ua->request($form->click);
Hope it helps. Миша13 18:20, 5 December 2006 (UTC)
    use LWP::UserAgent;
    use HTML::Form;

    # Fetch the "new section" edit form and parse it.
    my $ua = LWP::UserAgent->new;
    my $response = $ua->get("http://en.wikipedia.org/?title=Category_talk:Roman_frontiers&action=edit&section=new");
    my $form = HTML::Form->parse($response);

    # Read the fields needed to submit the edit.
    my $text = $form->find_input('wpTextbox1')->value;
    my $summary = $form->find_input('wpSummary')->value;
    my $save = $form->find_input('wpSave')->value;
    my $edittoken = $form->find_input('wpEditToken')->value;
    my $starttime = $form->find_input('wpStarttime')->value;
    my $edittime = $form->find_input('wpEdittime')->value;

    # Print the values as a simple HTML page for inspection.
    print "Content-type: text/html\n\n";
    print "Text field: $text<br><br>";
    print "Summary: $summary<br><br>";
    print "Save: $save<br><br>";
    print "Edit token: $edittoken<br><br>";
    print "Start Time: $starttime<br><br>";
    print "Edit Time: $edittime<br><br>";
    exit;
Well, for starters, try my Perl bot framework, Perlwikipedia. That should give you all the code you need to edit pages, without all the mucking about with edit tokens. Shadow1 (talk) 00:55, 6 December 2006 (UTC)
Oh, by the way: As far as I can tell, the trailing backslash is part of the edit token. Shadow1 (talk) 18:49, 6 December 2006 (UTC)
Incidentally, I have found the explanation of this mysterious backslash at the end of the edit token: r18112. Tizio 16:03, 10 December 2006 (UTC)
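The form-scraping step discussed in this thread can also be sketched in Python with only the standard library. This is an illustration under assumptions, not a working bot: the sample HTML and token value are made up, the helper names `extract_fields` and `build_post_body` are hypothetical, and it assumes the save is sent as an HTTP POST of the form fields. The field names are the ones used in the Perl snippet above. It also shows why the trailing `+\` matters: both characters must be URL-encoded when the token is sent back.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class EditFormFields(HTMLParser):
    """Collects name -> value for every <input> element in the page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            attr = dict(attrs)
            name = attr.get("name")
            if name:
                self.fields[name] = attr.get("value") or ""


def extract_fields(html):
    """Return the input fields of an edit form as a dict."""
    parser = EditFormFields()
    parser.feed(html)
    parser.close()
    return parser.fields


def build_post_body(fields, text, summary):
    """Merge the new text and summary into the scraped fields and encode them."""
    payload = dict(fields)
    payload["wpTextbox1"] = text
    payload["wpSummary"] = summary
    return urlencode(payload)


# Made-up edit form; the token value is a placeholder ending in "+\".
SAMPLE = """
<form id="editform" method="post" action="/w/index.php?title=Sandbox&amp;action=submit">
<input type="hidden" name="wpStarttime" value="20061205174800" />
<input type="hidden" name="wpEdittime" value="20061205170000" />
<input type="hidden" name="wpEditToken" value="0123456789abcdef+\\" />
<input type="text" name="wpSummary" value="" />
</form>
"""

fields = extract_fields(SAMPLE)
body = build_post_body(fields, "Hello", "test edit")
print(body)
```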
I've created this template as a more specialized version of {{bot}} for AWB users. MaxSem 16:16, 30 December 2006 (UTC)