Operator: Jeff G. (talk · contribs)
Time filed: 18:21, Wednesday February 23, 2011 (UTC)
Automatic or Manually assisted: automatic posting as a part of manually initiated runs
Programming language(s): Python (latest v2.x version, 2.7.1 as of 2010-11-27)
Source code available: standard pywikipediabot (latest nightly build, weblinkchecker.py as last modified 2010-12-23, internally stamped "8787 2010-12-22 22:09:36Z")
Function overview: finds broken external links and reports them to the talk page of the article in which the URL was found, per m:Pywikipediabot/weblinkchecker.py. weblinkchecker.py creates two files (the workfile deadlinks-wikipedia-en.dat and the results file results-wikipedia-en.txt), which would be manually distributed by the Operator: the first on request, as part of and after the discussion below, and the second at /results-wikipedia-en.txt (a subpage of this page) or User:JeffGBot/results-wikipedia-en.txt. A sample invocation is sketched after the function details below.
Links to relevant discussions (where appropriate):
Edit period(s): manual runs, at least once every two weeks
Estimated number of pages affected: all talk pages of article-space pages with broken external links, using the default put throttle of 10 seconds between posts (at most 6 posts per minute); see the configuration sketch below.
Exclusion compliant (Y/N): N/A - this bot is not intended to touch user or user talk pages
Already has a bot flag (Y/N):
Function details: can be found at m:Pywikipediabot/weblinkchecker.py. With reference to some questions at Wikipedia:Bots/Requests for approval/PhuzBot and elsewhere, I have asked for some assistance at m:Talk:Pywikipediabot/weblinkchecker.py#Questions_from_BRFAs_and_elsewhere_on_English_Wikipedia.
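For reference, a minimal sketch of how such a manually initiated run might be configured and launched, assuming standard pywikipediabot conventions; the put_throttle and report_dead_links_on_talk names follow the framework's user-config.py and should be checked against the installed nightly:

    # user-config.py (excerpt) -- assumed pywikipediabot settings
    family = 'wikipedia'
    mylang = 'en'
    usernames['wikipedia']['en'] = u'JeffGBot'
    put_throttle = 10                   # one edit per 10 seconds (the default)
    report_dead_links_on_talk = True    # post findings to article talk pages

    # Manually initiated runs (shell):
    #   python weblinkchecker.py -start:!   # check all articles from the top
    #   python weblinkchecker.py -repeat    # recheck links recorded in
    #                                       # deadlinks-wikipedia-en.dat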
Bots that download substantial portions of Wikipedia's content by requesting many individual pages are not permitted. When such content is required, download database dumps instead. Even were it modified to crawl every page querying prop=extlinks instead of downloading the page text, IMO it would still be better done from a database dump or a toolserver query. I also would like to know why you intend to post to article talk pages instead of applying {{dead link}} directly to the page, and how this would interact with User:WebCiteBOT or other processes that provide archive links; for example, would it complain about dead links in |url= for every {{cite web|url=|archiveurl=}}. Anomie ⚔ 19:51, 23 February 2011 (UTC)
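As an illustration of the prop=extlinks approach mentioned above, a minimal Python 2 sketch against the standard MediaWiki API; this is not part of the bot as filed, and the article title is a placeholder:

    # Minimal sketch: list external links on one article via prop=extlinks,
    # instead of downloading and parsing the full page text.
    import json
    import urllib
    import urllib2

    params = urllib.urlencode({
        'action': 'query',
        'prop': 'extlinks',
        'titles': 'Example',   # placeholder article title
        'ellimit': '500',
        'format': 'json',
    })
    req = urllib2.Request('http://en.wikipedia.org/w/api.php?' + params,
                          headers={'User-Agent': 'JeffGBot-example/0.1'})
    data = json.load(urllib2.urlopen(req))
    for page in data['query']['pages'].itervalues():
        for link in page.get('extlinks', []):
            print link['*']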
{{cite web|url=|archiveurl=}}. Anomie ⚔ 00:32, 24 February 2011 (UTC)
{{BAG assistance needed}} Are there any further questions, comments, or concerns? Thanks! — Jeff G. ツ 02:53, 23 March 2011 (UTC)
Given the number of dead links, I strongly suggest the bot place {{Dead link}} directly instead and attempt to fix them with Wayback/WebCite. Few people repair dead links in articles, and I feel posting them on the talk page will be even more cumbersome. A couple of bots are already approved for that, though inactive. — HELLKNOWZ ▎ TALK 12:23, 3 April 2011 (UTC)
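For the Wayback half of that suggestion, a minimal Python 2 sketch using the Internet Archive's availability endpoint; the endpoint URL and response shape are assumptions about that external service, not part of this bot:

    # Minimal sketch: ask the Wayback Machine for an archived copy of a dead
    # URL, the kind of lookup a {{Dead link}}-fixing bot would perform.
    import json
    import urllib
    import urllib2

    def closest_snapshot(dead_url):
        """Return the closest Wayback snapshot URL, or None if unarchived."""
        query = urllib.urlencode({'url': dead_url})
        data = json.load(urllib2.urlopen(
            'http://archive.org/wayback/available?' + query))
        snap = data.get('archived_snapshots', {}).get('closest')
        return snap['url'] if snap and snap.get('available') else None

    print closest_snapshot('http://example.com/')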
I started sharing the results file here on English Wikipedia as wikitext, but that has proven too cumbersome because of size (limited to 2 MB) and spam filters, so I have instead started sharing both files via Windows Live SkyDrive here. The results files are in sequential order. — Jeff G. ツ 18:48, 22 April 2011 (UTC)