This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page. |
In relation to my earlier idea for an archiveurl bot, what about this: a bot that looks for dead link-tagged references, looks in the wayback machine, and if it finds the link stored there, it adds the archiveurl/archivedate stuff into the article, and detags the ref, else, if it doesn't, it posts some kind of comment about the ref not being accessible. Lukeno94 (tell Luke off here) 20:27, 24 May 2013 (UTC)
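The lookup step of the request above can be sketched against the Internet Archive's public availability API. This is only an illustrative sketch: the endpoint and JSON shape are the Archive's documented API, but the function names are made up here, and a real bot would need error handling, rate limiting, and the actual wikitext editing.

```python
import json
import urllib.parse
import urllib.request

WAYBACK_API = "https://archive.org/wayback/available"

def closest_snapshot(api_response):
    """Pick the closest archived snapshot out of an availability-API
    response dict; returns (archive_url, timestamp) or None if the
    Wayback Machine has no copy."""
    snap = api_response.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap["url"], snap["timestamp"]
    return None

def lookup(url):
    """Ask the Wayback Machine whether it has a copy of `url`
    (this one makes a live network call)."""
    query = WAYBACK_API + "?url=" + urllib.parse.quote(url, safe="")
    with urllib.request.urlopen(query) as resp:
        return closest_snapshot(json.load(resp))
```

If `closest_snapshot` returns a hit, the bot would fill in |archiveurl=/|archivedate= and detag the ref; if it returns None, it would leave the "not accessible" comment instead.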
Seems like this is impossible due to the vast number of transclusions involved. Thegreatgrabber (talk) contribs 23:20, 24 May 2013 (UTC)
@ Thegreatgrabber: are you working on this? Theopolisme ( talk) 18:01, 28 May 2013 (UTC)
Currently this list must be updated manually, which is a tedious time-waster; I believe it's the perfect task for a purpose-built little bot. I can't see it being very difficult to create one that gathers the data and updates the list. Can we have one for the page? — Preceding unsigned comment added by Doc9871 ( talk • contribs)
/* FROM */ 'watchlist', /* SELECT */ 'count(wl_user) AS num',
See this discussion. In a nutshell, we at WikiProject Medicine want to add a simple template to the top of all the talk pages that transclude certain infoboxes. It should run periodically to check for new articles; so, obviously, it should only add the template if it doesn't already exist. (Alternatively, it could keep a list of pages that it's already updated in a static store somewhere.)
I haven't written a bot before, but looking at Perl's Mediawiki::Bot modules, I think I could probably write this in a couple of hours. But, it strikes me this must be a very common pattern, and I'm wondering if there isn't a generic bot that can be used, or if there is an example someone could point me to that I could just copy-and-hack.
Thanks! Klortho ( talk) 01:23, 6 June 2013 (UTC)
Many project pages list articles with their "class" ratings – such as Vital articles. It would be nice to have a bot keep these up-to-date.
Some points:
|bot=yes
Actually, I'd be willing to create this myself, but I don't know the first thing about making bots... :-( Ypnypn ( talk) 03:41, 5 June 2013 (UTC)
User:B left the following message at WP:VPT, but nobody's responded to it.
Can we get a bot to go through and substitute the "migration" flag of images with the {{ GFDL}} template? Right now, today, if someone uploads a {{ GFDL}} image, it shows the {{ License migration}} template. Now that it is four years after the license migration, I think it makes sense to change the default to be not eligible for migration. But in order to change the default, we would need a bot to do a one-time addition of something like migration=not-reviewed to all existing uses of the GFDL template. So if you have {{ GFDL}} or
{{ self|GFDL}}
, the template would add "migration=not-reviewed" to the template. After that is done, we can make the default "not-eligible" instead of the default being pending review. -- B ( talk) 00:16, 2 June 2013 (UTC)
I thoroughly agree with this idea; the template needs to be modified so that it defaults to migration=not-eligible, and we need to mark existing examples as needing review. This wouldn't be the first bot written to modify tons of license templates; someone's addition of "subject to disclaimers" to {{ GFDL}} (several years ago) forced Commons to maintain two different GFDL templates, and they had to replace many thousands of {{ GFDL}} transclusions. Nyttend ( talk) 03:55, 4 June 2013 (UTC)
So the bot would add |migration=not-reviewed to all transclusions of {{ GFDL}}, provided that there was no value set for |migration=? Hazard-SJ ✈ 05:01, 5 June 2013 (UTC)
I'm removing superfluous redlinks from country outlines.
To remove the redlinks of "Political scandals of foo" (where foo is a country name) from each country outline, I take a list of country outlines (on a user subpage) and replace "Outline of" with "Political scandals of". That list now shows which links are red.
Then I use AWB's listmaker to make a list of all the redlinks on that page, and I paste them back in to the same page and linkify them. Now there's only redlinks on that page.
Next I change "Political scandals of" back to "Outline of". The list is now all the country outlines that contain the "Political scandals of" redlink in them.
I make a list in AWB from that, and then do a regex search/replace with AWB to get rid of the entry in each of those outlines.
Unfortunately, I have to do this whole procedure over again for each redlink I wish to nuke (so as not to nuke the blue links too), and this approach only works with sets of links that share the "of foo" nomenclature, because AWB (as far as I know) has no way to search/replace strings based on redlink status.
If you know of an easier/faster way to do this, please let me know. The Transhumanist 06:18, 19 May 2013 (UTC)
I added a magic entry to {{ Outline City+}}, and it turned invisible. Not a big problem, but once the whole template is converted, it will make it look rather sparse upon first inspection. :) The Transhumanist 07:55, 20 May 2013 (UTC)
Some of the items have section links. Will #ifexist check the existence of sections? The Transhumanist 08:08, 20 May 2013 (UTC)
{{#ifexist:Main page#Nonsense section name|yes|no}}
⇒ yes -- John of Reading ( talk) 06:24, 21 May 2013 (UTC)

Manually stripping redlinks from outlines is a royal pain in the...
See Outline of Guam, for example.
I desperately need help from someone with a bot (or help building one) that can do all of the following:
Remove each bullet list entry that has no offspring and that consists entirely of a redlink (like "Flora of Guam", below),
but only delink (remove the double square brackets from) those that have offspring (like Wildlife of Guam and Fauna of Guam, above). (By "offspring", I mean one or more non-red subitems.) So the above structure should end up looking like this:
If a redlink entry has an annotation or descriptive prose after it, it should be delinked rather than deleted. Here are some examples from
Outline of Guam:
These should end up looking like this:
Also, "main" article entries that are red must be removed. Ones that look like this:
But, if they have any bluelinks in them, only the redlinks should be removed. So...
...should be made to look like this:
If a section is empty or is made empty due to all of its material being deleted, its heading must be deleted as well. (See
Outline of Guam#Local government in Guam. This heading will have no content once its red "main" entry is removed, and so this heading needs to be removed too.)
Many outlines have had redlinks sitting in them for years. They need to be cleaned up. There are so many outlines, this is infeasible to do manually.
Are there any bots that can do this?
If not, how would a script check a link on a page it was processing for whether or not the link was a redlink?
I look forward to your replies. The Transhumanist 00:37, 20 May 2013 (UTC)
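To answer the "how would a script check for a redlink" question above: the MediaWiki API reports nonexistent pages with a "missing" key when you query their titles, so a bot only needs to batch the link targets into one query and collect the missing ones. A minimal sketch of the response-parsing half (the function name is mine; the response shape is the standard api.php?action=query&titles=... JSON):

```python
def redlinks(api_result):
    """Given the parsed JSON of a MediaWiki API call such as
    api.php?action=query&format=json&titles=A|B, return the set of
    titles that do not exist, i.e. the ones that would render as
    redlinks."""
    pages = api_result["query"]["pages"]
    return {p["title"] for p in pages.values() if "missing" in p}
```

A bot would extract all [[...]] targets from the outline, feed them through this in batches of 50 (the API's usual limit), and then delete or delink the entries whose targets come back missing.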
@ The Transhumanist and Theopolisme: According to BAG, this needs wider discussion. — cyberpower ChatOnline 18:55, 7 June 2013 (UTC)
The request is for an automated bot that scans through Category:Non-free images for NFUR review and attempts to automatically add a preformatted NFUR rationale when one is not present.
This bot would not apply to all non-free content and would be limited initially to {{ Non-free album cover}}, {{ Non-free book cover}}, {{ Non-free video cover}} and {{ Non-free logo}} tagged media where the image is used in an infobox. Essentially this bot would do automatically what I've been doing extensively in a manual fashion with FURME.
In adding the NFUR, the bot would (having added a rationale) also add the |image has rationale=yes param, as well as leaving an appropriate note that the rationale was autogenerated.
By utilising a bot to add the types of rationale concerned automatically, contributor and admin time can be freed to deal with more complex FUR claims, which do not have easy pre-formatted rationales or which require a more complex explanation.
Sfan00 IMG ( talk) 14:28, 4 June 2013 (UTC)
Here's my general outline of what it looks like the bot will need to do:
For all files in Category:Non-free images for NFUR review:
  If image meets the following constraints:
    - tagged with {{Non-free album cover}}, {{Non-free book cover}}, {{Non-free video cover}}, or {{Non-free logo}}
    - only used in one article
    - file must be the only non-free file in the article
  Then, on the image page:
    - add some fair-use rationales to {{Non-free use rationale}} or {{Non-free use rationale 2}}
      *** I will need rationales to insert ***
    - add "|image has rationale=yes" to the non-free license template
    - add a new parameter "bot=Theo's Little Bot" to the non-free license template
      (this might need additional discussion as far as implementation/categorization)
As you can see, there are still some questions -- #1, are there rationales prewritten that I can use ( Wikipedia:Use_rationale_examples, possibly...)? Secondly, as far as clarifying that it was reviewed by a bot, I think adding |bot= parameters to {{ non-free album cover}} and such would be easy enough, although if you have other suggestions I'm all ears. Theopolisme ( talk) 06:53, 7 June 2013 (UTC)
is a straight translation (and partial extension) of what FURME does, substituting the {{ Non-free use rationale}} types for the {{ <blah> fur}} types it uses currently. FURME itself needs an overhaul and re-integration into TWINKLE, but that would be outside the scope of a bot request.
still needs to be tweaked, but in essence it uses the |bot= and |reviewed= style as opposed to modification of the license template. Note this also means it's easier to remove the tag once it's been human-reviewed. Sfan00 IMG ( talk) 11:10, 7 June 2013 (UTC)
Hi! I am looking for a bot that could update entries at Wikipedia:Adopt-a-user/Adoptee's Area/Adopters from "currently accepting adoptees" to "not currently available" if the adopter hasn't made any edits for a certain period of time. This is because of edits like this, where a new user asks for an adopter, but because the adopter has gone from Wikipedia, they never get a reply and just leave Wikipedia. Thanks, and if you need more clarification just ask! jcc ( tea and biscuits) 17:14, 6 June 2013 (UTC)
Heh, it's me again, with more "archive" bot requests. Here's another simple two: 1: If the |url= part of a reference has the web.archive.org string in it, remove it and strip it back to the proper URL. 2: If a reference has an |archiveurl tag but is lacking an |archivedate tag, grab the archiving date from the relevant part of the archive URL; e.g. http://web.archive.org/web/20071031094153/http://www.chaptersofdublin.com/books/Wright/wright10.htm would have "20071031" grabbed and formatted to "2007/10/31". [4] shows where I did this sort of thing manually. Lukeno94 (tell Luke off here) 20:49, 6 June 2013 (UTC)
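Both halves of this request are regex-friendly, since Wayback URLs embed the snapshot timestamp and the original URL in fixed positions. A sketch (function names are mine; this ignores edge cases like the `web/2007*` wildcard form):

```python
import re

# A Wayback snapshot URL looks like:
#   http://web.archive.org/web/YYYYMMDDhhmmss/<original url>
SNAPSHOT_RE = re.compile(r"web\.archive\.org/web/(\d{4})(\d{2})(\d{2})\d*/(.*)")

def archive_date(url):
    """Pull the snapshot date out of a Wayback URL, as YYYY/MM/DD,
    or None if the URL is not a Wayback snapshot."""
    m = SNAPSHOT_RE.search(url)
    return "/".join(m.group(1, 2, 3)) if m else None

def original_url(url):
    """Strip a Wayback snapshot URL back to the live URL it wraps."""
    m = SNAPSHOT_RE.search(url)
    return m.group(4) if m else url
```

Task 1 is `original_url` applied to |url= values; task 2 is `archive_date` applied to |archiveurl= values when |archivedate= is absent.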
The Wikipedia:Most missed articles list -- often-searched-for, nonexistent articles -- has not been updated since a batch run in 2008. The German Wikipedia editor Melancholie (de:Benutzer:Melancholie) who did the batch run has not been active since 2009. Where would be a good place to ask for someone with expertise to do another run? It does not seem to fit Wikipedia:Village pump (technical), since it is not a technical issue about Wikipedia. It is not a new proposal, and not a new idea. It is not about help using Wikipedia, and it is not a factual WP:Reference Desk question. I didn't find a WikiProject that looked promising. So I am asking for direction here. -- Bejnar ( talk) 19:52, 11 June 2013 (UTC)
Add {{ Hampton, Virginia}} to every page in Category:Neighborhoods in Hampton, Virginia. Emmette Hernandez Coleman ( talk) 23:09, 17 June 2013 (UTC)
This bot's job would be to stick {{ unreliable source}} next to references that are links to websites of sketchy reliability as third-party sources (e.g., blog hosting sites, tabloids, and extremely biased news sites). The list of these sites would be a page in the MediaWiki namespace, like MediaWiki:Spam-blacklist. To handle false positives (for instance, when the site is being cited as a primary source), the bot would add a hidden category (maybe Category:Unreliable source tags added by a bot) after the template to identify it as being added by the bot, and enable editors to check the tags. The hidden category would be removed by the editor if it was an accurate tagging. If not, the editor would comment out the {{ unreliable source}} tag, which would mean that it would be skipped over by the bot in the future. ❤ Yutsi Talk/ Contributions ( 偉特 ) 14:16, 13 June 2013 (UTC)
Something like {{ unreliable source|bot=AwesomeBot}} could work; however, how you construct your URL blacklist is probably most important. For example, examiner.com is on the blacklist because it's just overall not a reliable source. However, there are many links to it, and those have all probably been individually whitelisted [citation needed]. So tagging those wouldn't be useful. Did you have a few example domains in mind? That would help in evaluating your request. Legoktm ( talk) 16:54, 13 June 2013 (UTC)
I'm a lawyer. One feature I really like is the ability to enter a case's legal citation, like "388 U.S. 1" (the legal citation for Loving v. Virginia), as the "title" of a Wikipedia article, and have that entry automatically redirect to the correct page. However, this only exists for Supreme Court cases up to somewhere in the 540th volume of the United States Reports, and (as far as I know) not for any other cases.
It would be great if a bot could automatically create redirects between a legal citation and a page about that case. — Preceding unsigned comment added by Jmheller ( talk • contribs) 03:42, 16 June 2013 (UTC)
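The structured "volume reporter page" form of legal citations makes this mechanical: parse the citation from a case article's infobox, then create the citation title as a redirect. A sketch of the parsing step (the handful of reporter abbreviations here is illustrative only; a real bot would need the full list):

```python
import re

# Volume, reporter abbreviation, first page -- e.g. "388 U.S. 1".
# The reporters listed are just examples, not a complete set.
CITATION_RE = re.compile(r"^(\d+)\s+(U\.S\.|F\.2d|F\.3d|S\. ?Ct\.)\s+(\d+)$")

def parse_citation(cite):
    """Split a citation like '388 U.S. 1' into (volume, reporter,
    page), or return None if it isn't in a recognized form."""
    m = CITATION_RE.match(cite)
    if not m:
        return None
    vol, reporter, page = m.groups()
    return int(vol), reporter, int(page)
```

Given a parsed citation and the article it came from, the bot would then create the page "388 U.S. 1" containing "#REDIRECT [[Loving v. Virginia]]", skipping citations that already exist as pages.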
@ Jmheller: What should be done for cases like [7] where cases are only listed by docket number? Thegreatgrabber (talk) contribs 02:21, 21 June 2013 (UTC)
I just found out that the links to the GeoWhen database are dead. However, through a web search I discovered that the pages are still accessible – just add bak/ after the domain http://www.stratigraphy.org and before the geowhen/ part. Some links have been fixed already, but not all. Note: an additional mirror is found at http://engineering.purdue.edu/Stratigraphy/resources/geowhen/. -- Florian Blaschke ( talk) 16:03, 14 June 2013 (UTC)
Wikipedia:Categories for discussion/Working/Manual#Templates removed or updated - deletion pending automatic emptying of category has a very large backlog of hidden categories that have been renamed. Due to the way some or all of these categories are generated by the template the job queue alone doesn't seem able to process them and individual articles require null edits to get the updated category. Is it possible for a bot to have a crack at these? Timrollpickering ( talk) 16:10, 21 June 2013 (UTC)
This one should be straightforward:
Et voila. :) Lukeno94 (tell Luke off here) 14:03, 22 June 2013 (UTC)
This is a request for assistance in moving the assessment categories for WikiProject Rational Skepticism. Greg Bard ( talk) 20:02, 23 June 2013 (UTC)
Fixed all categories manually, updating all pages with my bot. -- Magioladitis ( talk) 21:45, 23 June 2013 (UTC)
I deleted all the old categories, and fixed/normalised all banners and user wikiproject tags. -- Magioladitis ( talk) 23:25, 23 June 2013 (UTC)
Is there any way that a bot can go through and find instances where an article has the {{ unreferenced}} template along with {{ Reflist}}, <references />, or any other coding pertaining to references? I've seen a lot of instances of {{ unreferenced}} being used on articles that do have references. This seems like it should be an easy fix. Ten Pound Hammer • ( What did I screw up now?) 03:35, 20 June 2013 (UTC)
Since <ref>...</ref> tags can contain notes instead of references, you might want to limit your search to citation templates, such as {{ cite web}} and {{ cite news}}. GoingBatty ( talk) 02:52, 22 June 2013 (UTC)
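Following GoingBatty's suggestion, the check being discussed reduces to two pattern matches over the page wikitext. A rough sketch (the function name is mine, and the citation-template list is a deliberately narrow example; a real bot would handle many more template names and redirects of {{unreferenced}}):

```python
import re

# {{unreferenced}} in any of its capitalizations, with or without
# a |date= parameter.
UNREFERENCED_RE = re.compile(r"\{\{\s*[Uu]nreferenced\b")

# A few common citation templates, per the suggestion to look for
# these rather than bare <ref> tags (which may hold notes).
CITE_RE = re.compile(r"\{\{\s*[Cc]ite (web|news|book|journal)\b")

def wrongly_tagged(wikitext):
    """True when a page carries {{unreferenced}} but also contains a
    citation template, i.e. the tag probably doesn't belong there."""
    return bool(UNREFERENCED_RE.search(wikitext)) and bool(CITE_RE.search(wikitext))
```

The bot would then either remove the tag outright or, more conservatively, list the matches for human review.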
Here's an example of an article using <ref>...</ref> tags to contain a note: Battle of Breitenfeld (1642). GoingBatty ( talk) 12:22, 22 June 2013 (UTC)
Do as I did here in disambiguation pages, per WP:DABSTYLE. -- Magioladitis ( talk) 15:39, 23 June 2013 (UTC)
Hi! Is anyone interested in writing a bot that adds articles about Iraqi cities, using census data, to the Sorani Kurdish Wikipedia (CKB)?
I found Iraqi census data at http://cosit.gov.iq/pdf/2011/pop_no_2008.pdf ( Archive), and the idea is something like User:Rambot creating U.S. cities with census data.
Thanks WhisperToMe ( talk) 00:11, 26 June 2013 (UTC)
I was wondering if anyone would be able to create a bot that could copy the information about planets detected by the Kepler spacecraft from the Extrasolar Planets Encyclopaedia ( link) to our list of planets discovered using the Kepler spacecraft. Rather than merely going to the list, it would be ideal if the bot could follow the link for each Kepler planet and get the full information from there, rather than merely looking at the catalog. The information in the EPE about Kepler planets is in turn copied from the Kepler discoveries catalog, which is in the public domain but is unfortunately offline at the moment (requiring us to use the copyrighted EPE). In addition to the basic information, I would like it if our bot were able to calculate surface gravity where possible based on mass/(r^2). Thanks, Wer900 • talk 18:08, 27 June 2013 (UTC)
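The surface-gravity part of this request is a one-line formula, g = GM/r², once mass and radius are in SI units (note that catalogs typically list planetary mass and radius in Jupiter units, so a conversion step would come first). A sketch:

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def surface_gravity(mass_kg, radius_m):
    """Surface gravity g = G*M/r^2, in m/s^2, from mass in kg and
    radius in metres."""
    return G * mass_kg / radius_m ** 2
```

Sanity check: Earth's mass and radius give roughly 9.8 m/s², the familiar value.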
Hi. I haven't historically been a big editor on Wikipedia. Though I use it from time to time. I realize that there are probably a number of bots at work using various methods to target entries for improvement. However, I just wanted to add my two cents on a method which may or may not be in use.
First, however, some quick background. I am currently taking a Data Science class and for one of my assignments I developed a script which selects a random Wikipedia article and does the following:
1) Counts total words and total sources (word count does not include 'organizational' sections such as References, External Links, etc.)
2) Uses all words contributing to the word count to assess the overall sentiment of the text. For this, I used the AFINN dictionary and the word count to get an average sentiment score per word.
3) For each section and sub-section (h2 and h3) in the page which is not organizational in nature (see above definition), counts the number of words and citations and, as with item 2, gets a sentiment score for the section/sub-section
So my thought on using this script is as follows:
If it was used to score a large number of Wikipedia pages, we could come up with some parameters on which a page and its sections and subsections could be scored.
1) For all articles: word count, source count and sentiment score.
2) For all sections and sub-sections: word count, citation count and sentiment score.
3) For pages with sources: a sources-per-word score.
4) For sections with citations: a words-per-citation score.
For all of these parameters, the scores from the sample could be used to determine what sort of statistical distribution they follow. A bot could then scan the full set of wikipedia articles and flag those which are beyond some sort of tolerance limit.
Additionally, data could be collected for sections which commonly occur (Early Life, Private Life, Criticisms etc.) to establish expected distributions for those specific section types. For example, we might expect the sections labeled Criticisms would, on average, have a more negative sentiment than other sections.
I hope this all makes sense and perhaps some or all of it is being done. I look forward to hearing from some more experienced Wikipedians on the idea. — Preceding unsigned comment added by 64.134.190.157 ( talk) 18:46, 21 June 2013 (UTC)
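Step 2 of the proposal above (average AFINN sentiment per word) can be sketched in a few lines. The dictionary here is a tiny stand-in; the real AFINN list scores roughly 2,500 English words from -5 to +5 and would be loaded from its published file:

```python
import re

# Toy stand-in for the AFINN word list, for illustration only.
AFINN = {"good": 3, "great": 3, "bad": -3, "criticism": -2}

def sentiment_per_word(text):
    """Average AFINN score per word, over every word in the text
    (unknown words score 0, matching the proposal's per-word average)."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(AFINN.get(w, 0) for w in words) / len(words)
```

Running this per h2/h3 section, plus word and citation counts, gives the per-section feature vector the proposal describes; flagging would then compare each section's score against the distribution for its section type.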
Put {{ South Alexandria}} on every article listed on it, and put the listed articles in a category called Category:South Alexandria, Virginia. Emmette Hernandez Coleman ( talk) 09:20, 29 June 2013 (UTC)
Any chance someone could write a bot to license tag these quickly?
[6=1&templates_no=db-i3&sortby=uploaddate&ext_image_data=1&file_usage_data=1]
Most seem to be reasonably straightforward Sfan00 IMG ( talk) 16:20, 18 June 2013 (UTC)
etc... There may be some others, but this will start to clear out the 5000 or so I've found on the catscan query notes over on WP:AWB. Sfan00 IMG ( talk) 10:14, 23 June 2013 (UTC)
Back in March there was consensus in Proposals to test out Template:COI editnotice on the Talk page of articles about extant organizations, to see if it increases the use of {{ Request edit}} and reduces COI editing. A bot was approved in April to apply the template to Category:Companies based in Idaho. The bot request said it would affect 1,000+ articles, which would be enough of a sample to test, but it looks like it was only applied to about 40 articles. I am unsure if the bot was never run on the entire category, or if we need a larger category. The original bot-runner is now retired. Any help would be appreciated. CorporateM ( Talk) 14:12, 30 June 2013 (UTC)
I know that previously PhotoCatBot has tagged articles fitting these criteria, but is there any way that we could get a new bot to help finish up where this bot left off almost three years ago? Thanks! Kevin Rutherford ( talk) 02:16, 1 July 2013 (UTC)
Make sure all articles listed on {{ Arlington County, Virginia}} have the template, and are in Category:Neighborhoods in Arlington County, Virginia.
Create redirects to these articles in the formats "X, Virginia", "X, Arlington, Virginia", and "X". For example, all of the following should redirect to Columbia Forest Historic District: Columbia Forest; Columbia Forest, Virginia; Columbia Forest, Arlington, Virginia; Columbia Forest Historic District, Virginia; and Columbia Forest Historic District, Arlington, Virginia. Emmette Hernandez Coleman ( talk) 10:41, 3 July 2013 (UTC)
Howdy, I haven't had the need for a bot before, but I'm organizing a meetup and would like help in posting invites to the folks on this list. I can come up with a short message; is that all I need to provide, or is there any more info needed? Thanks, Olegkagan ( talk) 00:44, 27 June 2013 (UTC)
Okay, and it would be nice to have the message to be added in case a trial is requested of me, thanks. Hazard-SJ ✈ 02:19, 3 July 2013 (UTC)
Please could somebody "Subst:" all 84 article-space transclusions of the German-language {{ Infobox Unternehmen}}, which is now a wrapper for the English-language {{ Infobox company}}, as in this edit? That may be a job for AWB, which unfortunately I can't use on my small-screen netbook. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:08, 10 July 2013 (UTC)
Remove all articles in {{ Peconic County, New York}} that are NOT in the following categories:
Category:Sag Harbor, New York,
Category:Riverhead (town), New York,
Category:Shelter Island (town), New York,
Category:Southampton (town), New York,
Category:Southold, New York.
Emmette Hernandez Coleman ( talk) 04:13, 12 July 2013 (UTC)
Or alternatively, remove all articles that ARE in the following categories:
Category:Babylon (town), New York,
Category:Brookhaven, New York,
Category:Huntington, New York,
Category:Islip (town), New York,
Category:Smithtown, New York.
Emmette Hernandez Coleman ( talk) 04:21, 12 July 2013 (UTC)
Never mind. The template might be deleted, so there's no point in putting that effort in until we know it will be kept. Emmette Hernandez Coleman ( talk) 09:08, 13 July 2013 (UTC)
Given NoomBot has been down since April, maybe another bot could be made to make/update the ever-useful Book Reports. igordebraga ≠ 01:31, 13 July 2013 (UTC)
Add {{ Orleans Parish, Louisiana}} to every page it lists. Emmette Hernandez Coleman ( talk) 22:33, 13 July 2013 (UTC)
Also {{ Neighborhoods of Denver}}. At the rate I'm going, I'll probably create a few more navboxes in the next few days, so it would be easier to do a bunch of navboxes together; don't bother with either of these yet. Emmette Hernandez Coleman ( talk) 08:20, 14 July 2013 (UTC)
I've just added an hCalendar microformat to {{ Infobox poem}}, so a bot is now required to apply {{ Start date}} to the |publication_date= parameter.
The logic should be:
This is related to a larger request with clear community consensus, which has not yet been actioned; I'm hoping that this smaller, more manageable task, will attract a response, which can later be applied elsewhere. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:12, 14 July 2013 (UTC)
Is df=y especially important? At the moment, I haven't implemented it, since according to my tests it would slow down the script a fair bit. Theopolisme ( talk) 17:09, 14 July 2013 (UTC)
I would like to suggest a bot that disables autoblock on softblocks - if an admin accidentally enables autoblock on a softblock, my suggested bot will automatically DISable autoblock and enable account creation, e-mail, and talk page access if any, some or all of them are disabled. The bot will then lift the autoblock on the IP address affected by the soft-blocked user. 76.226.117.87 ( talk) 03:09, 15 July 2013 (UTC)
I already said that I wanted a bot to change block settings (see the declined request made by 76.226.117.87), but now I have fixed it:
The process goes like this: 1. An administrator ENables autoblock when blocking an account that should have autoblock DISabled. 2. The bot I'm suggesting will change the block settings for the affected account so that account creation, e-mail, and talk page access are all allowed, if any of them are disabled. The bot will also disable autoblock on the affected account, and lift the autoblock on the IP address recently used by the blocked account. 3. The resulting log entry should look something like this (timestamps will vary, and some extra comments are added that are not normally present in block log entries):
I thought of a bot that corrects simple spelling errors, such as beacuse → because and teh → the. buff bills 7701 22:17, 15 July 2013 (UTC)
Context: Citing references is important for medical articles (under WP:MED or related WikiProjects), though it is important for other articles as well. As books have ISBNs, almost all the renowned medical journals have PubMed listings and their articles bear a PMID. PubMed serves as a global quality-regulating and listing body for medical articles, and if a medical article does not have a PMID, chances are that the journal is not a popular one and therefore there is a possibility that it does not maintain the quality standards that are to be adhered to. Other medical articles have a Digital object identifier or DOI (with or without a PMID), which serves to provide a permanent link redirecting to its present URL. Some PubMed articles are freely accessible and have a PMC (alongside the PMID), which is therefore an optional parameter. Thus, if a <ref></ref> has neither PMID nor DOI, chances are that 1. the article has a PMID (most cases) but the <ref></ref> tag lacks its mention, or 2. the article lacks a PMID or DOI, and it's not a problem of the <ref></ref> placed.
I feel the requirement for two different bots.
Utility: These bots would enable the editors of medical articles to make the mentioned references more reliable and verifiable and would encourage users to use this template while placing references. Diptanshu Talk 16:15, 10 July 2013 (UTC)
The folks at the Village Pump suggested that I post this request here.
I have been correcting errors in citations, and I have noticed that pretty much every Russia-related article I edit contains an incorrectly formatted parameter, "language=ru", in its citation parameters. (The "language" parameter takes the full language name, not the two-letter code.)
You can see an example of one of these citations here. Note that the rendered reference says "(in ru)" instead of the correct "(in Russian)".
It occurred to me that someone clever with a script or bot or similar creature might be able to do a semi-automated find and replace of "language=ru" in citations with "language=Russian".
Is this a good place to request such a thing? I do not have the time or skills to take on such a project myself. Thanks. Jonesey95 ( talk) 14:06, 16 July 2013 (UTC)
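Whether done by bot or by Lua, the fix itself is a mapping from ISO 639-1 codes to language names. A bot-side sketch (the three-entry table here is illustrative; a real run would load the full code list, and would need to restrict matching to citation templates rather than the whole page text as this naive version does):

```python
import re

# Minimal code-to-name table, for illustration only; a complete
# bot would load the full ISO 639-1 list.
LANG_NAMES = {"ru": "Russian", "de": "German", "fr": "French"}

def fix_language_params(wikitext):
    """Turn |language=<two-letter code> into |language=<full name>,
    leaving unknown codes and already-correct full names untouched."""
    def repl(m):
        code = m.group(1)
        return "language=" + LANG_NAMES.get(code, code)
    return re.sub(r"language=([a-z]{2})\b", repl, wikitext)
```

The Lua approach mentioned in the pump thread would do the same lookup inside the citation template itself, avoiding a boatload of trivial edits.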
In the pump thread, using Lua to automatically parse the language codes was suggested -- I think that would definitely be preferable if possible, rather than having a bot make a boatload of fairly trivial edits. Theopolisme ( talk) 14:27, 16 July 2013 (UTC)
Wikipedia is too important and too useful a resource to have citations behind paywalls if there is another possible reference. In order to draw attention to references that need improving/substitution, it would be nice if there was a bot that would tag citations that are behind a paywall. I realize that some newspapers slowly roll articles behind a paywall as time passes; however, other newspapers have all their content behind a paywall. A good example is The Sunday Times. You can click on any link on http://www.thesundaytimes.co.uk and you will be presented with "your preview of the sunday times." For Wikipedians like myself who enjoy contributing by verifying citations, it is frustrating that we can't verify Sunday Times citations for free. When I see a paywall-tagged citation I often try to find another citation and substitute it. A bot would be helpful for this. DouglasCalvert ( talk) 23:51, 16 July 2013 (UTC)
WMF has turned Visual Editor on for IP accounts now, and the results are as expected: Filter 550 shows that a significant volume of articles are getting mutilated with stray wikitext. It has been proposed to set the filter to block the edits that include nowikis that indicate that the combination of VE and an unknowing editor has caused a problem.
I'd personally rather see this sort of mess cleaned up by a bot, and a message left on a user talk page that asks the editor either to stop using wiki markup or to stop using VE. I think it's possible to detect strings generated by VE (basically, it surrounds wikitext with nowiki tags), and figure out what the fix is (in some [most?] cases, just remove the nowiki tags), similar to how User:BracketBot figures out that an edit has broken syntax. Given that the problem is on the order of magnitude of 300 erroneous edits per day, is it possible to move with all deliberate pace to field such a bot?
(Background: see WP:VPR#Filter 550 should disallow.) -- John Broughton (♫♫) 03:31, 16 July 2013 (UTC)
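For the simple cases described above, where VE has wrapped valid wikitext in stray nowiki tags, the cleanup step is a straightforward unwrap. This sketch only does the mechanical part; deciding which nowikis are VE accidents (rather than intentional escaping) is the hard part and is not handled here:

```python
import re

NOWIKI_RE = re.compile(r"<nowiki>(.*?)</nowiki>", re.DOTALL)

def strip_nowiki(wikitext):
    """Unwrap <nowiki>...</nowiki> spans, keeping the markup inside.
    This is the simple fix for most VisualEditor escaping accidents,
    but it would wrongly unwrap deliberate nowikis too."""
    return NOWIKI_RE.sub(r"\1", wikitext)
```

A real bot would pair this with the Filter 550 log (to find affected edits) and the user-talk notification John Broughton describes.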
I could make WPCleaner detect <nowiki>...</nowiki> tags in main namespace articles and suggest a fix for each of them, basically suggesting to just remove the tags, except for specific cases. I've only one in mind for now: for a nowiki at the beginning of a line with whitespace characters after it, the whitespace characters should be removed too.

WPCleaner can now detect <nowiki>...</nowiki> in main namespace and suggest a fix. To activate this detection, edit Special:MyPage/WikiCleanerConfiguration and add the following contents (with the <source>...</source> tags):

# Configuration for error 518: nowiki tags
error_518_bot_enwiki=true
END

<nowiki>...</nowiki> tags are found and suggestions are given to fix them. It's quite basic, so if you think of any enhancement, tell me. -- NicoV ( Talk on frwiki) 22:52, 17 July 2013 (UTC)
I came here from the WP:Village Pump (proposals) page, and I suggest instead of an autofix bot, maybe a bot much like User:DPL bot? They could notify everyone who accidentally triggered the filter and each person could go back and fix it. Unless that would create lots of spam? Just a thought. kikichugirl inquire 22:21, 20 July 2013 (UTC)
It seems to me that VE is leading to a rise in external links [9] in article text, as refs. Can an automated bot move the link to refs or EL, with an edit note for a human to follow up? Thanks. Alanscottwalker ( talk) 10:48, 23 July 2013 (UTC)
DASHBot ( talk · contribs) used to create, and/or periodically update, a list of unreferenced biographies of living persons for a given Wikiproject (see User:DASHBot/Wikiprojects). However, that bot has been blocked since March. I'm wondering if another one can accomplish this same task. I'm asking on behalf of WP:JAZZ (whose list is at Wikipedia:WikiProject Jazz/Unreferenced BLPs) but there were a lot of other WikiProjects on that list, as well (I'd already removed WP:JAZZ, though). -- Gyrofrog (talk) 21:55, 23 July 2013 (UTC)
I have noticed that some links contain Google Analytics tracking parameters. It seems that Wikipedia links should not help companies track user clicks. I have even noticed that some advert-like entries about companies contain links with custom GA campaign parameters to target clicks from Wikipedia to the company page/blog/etc. Removing these GA tracking parameters seems like a great task for a bot. All Google Analytics parameters begin with utm. The basic parameters are:
Does this sound doable?
DouglasCalvert ( talk) 03:56, 13 July 2013 (UTC)
Every utm_ parameter, correct? Theopolisme ( talk) 21:56, 13 July 2013 (UTC)
I've written the basic program, and it successfully made this edit (incorporating the canonical url) -- if the canonical url isn't available, it'll just use a regular expression to strip the utm_ parameters. Douglas, is this what you were looking for? Theopolisme ( talk) 01:33, 14 July 2013 (UTC)
@ Hazard-SJ: could I steal the source code that you used to generate that list of urls (I assume it involved a db replica somewhere or other, they're all basically the same *wink*)? That way I won't have to manually crawl every...single...page... Theopolisme ( talk) 01:36, 14 July 2013 (UTC)
Could the same task be useful in trimming cruft from Google Books links, like this edit? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:30, 17 July 2013 (UTC)
Anyone? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:12, 21 July 2013 (UTC)
The page parameter, for example, definitely needs to stay -- are there others? I wouldn't want to remove something that would alter the appearance of the page for the person who clicks on the link (besides, say, removing extraneous search terms and such). Theopolisme ( talk) 00:35, 22 July 2013 (UTC)
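For reference, the regex/URL-parsing approach Theopolisme describes might look something like the following Python sketch (this is not his actual code, and the example URL is made up):

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

def strip_utm(url):
    """Drop Google Analytics utm_* tracking parameters from a URL,
    leaving every other query parameter untouched."""
    parts = urlparse(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if not k.lower().startswith("utm_")]
    return urlunparse(parts._replace(query=urlencode(kept)))

print(strip_utm("http://example.com/page?id=7&utm_source=wikipedia&utm_campaign=x"))
# → http://example.com/page?id=7
```

Parsing the query string properly, rather than running a raw regex over the whole URL, avoids mangling parameters whose values merely contain the string utm_.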
It seems like it would be of great use to create a bot that finds and lists the most vandalised pages on the wiki, and create a list article or essay that regularly updates the list, in order to alert all wikipedians to which pages require the most monitoring and reverting. Superwikiwalrus ( talk) 14:12, 27 July 2013 (UTC)
Hi! This is my first bot request so bear with me. The bot I am requesting will do/perform the following functions:
Does anyone here have any thoughts on the feasibility of this? ★★ KING RETROLORD★★ 09:10, 22 July 2013 (UTC)
One more request: could the bot check that there is a minimum of 1 citation per paragraph? Thanks, ★★ KING RETROLORD★★ 09:59, 22 July 2013 (UTC)
(==|Category:|\[\[File:|\[\[Image:|{{Authority control|{{.*}}|^<.*?>|^;|^\|)
. It's still not completely foolproof, though, and still gets some false positives.
Theopolisme (
talk) 04:10, 23 July 2013 (UTC)
Yes, when running "for real" the bot will report on all current nominations. As far as where the reports end up... I would like to just use sections on User:Theo's Little Bot/GAN, since that prevents having to create a ton of new pages—unless you have a reason why multiple pages would be beneficial. Theopolisme ( talk) 05:09, 23 July 2013 (UTC)
{{User:Theo's Little Bot/GAN/link}} can be used to automatically link to a specific article's listing. Theopolisme ( talk) 06:15, 23 July 2013 (UTC)
@ GoingBatty: I've implemented basic spell checking using Wikipedia:Lists of common misspellings/For machines ( commit). I initially tried using a larger corpus (courtesy of NLTK+Project Gutenberg), but it was taking way too long to process each article (5-8 minutes), so I settled for Wikipedia:Lists of common misspellings/For machines instead. It's not as complete, but should still catch "common misspellings." ;) Your thoughts? Is this adequate? Theopolisme ( talk) 19:17, 23 July 2013 (UTC)
@ Theopolisme: Can this bot be used to scan current Good Articles? The bot might be able to select some articles that might no longer meet the GA criteria for examination by human users. If the user decides it no longer meets the criteria, he can open a GAR.-- FutureTrillionaire ( talk) 01:35, 25 July 2013 (UTC)
Looks like the bot is done. However, there are issues. I checked about 20 of the articles listed, and it appears that the vast majority of these articles were selected due to having at least one dead link tag in the article. However, this is not very useful because dead links do not violate any of the GA criteria. I saw only one article that contained an orange tag, and a few articles containing only citation needed tags or disambiguation needed tags. Is it possible for the bot to ignore dead link tags and other less serious tags? I was hoping to just see articles with orange tags displayed at the top, or something like that.-- FutureTrillionaire ( talk) 01:50, 27 July 2013 (UTC)
{{Ambox}} (that's "the orange" you were talking about). Thoughts? Thanks for bearing with me on this. (Another note: for some reason, the bot listed articles from least->most tags...fixed.) Theopolisme ( talk) 02:33, 27 July 2013 (UTC)
{{Current}} and {{Split}}.-- FutureTrillionaire ( talk) 02:59, 27 July 2013 (UTC)
{{Current}} and {{Split}} wouldn't be included, since they aren't also in Category:Cleanup templates. Here's a page to enter templates for the whitelist, though, should you stumble upon anything. Theopolisme ( talk) 03:39, 27 July 2013 (UTC)
In relation to my earlier idea for an archiveurl bot, what about this: a bot that looks for dead link-tagged references, looks in the wayback machine, and if it finds the link stored there, it adds the archiveurl/archivedate stuff into the article, and detags the ref, else, if it doesn't, it posts some kind of comment about the ref not being accessible. Lukeno94 (tell Luke off here) 20:27, 24 May 2013 (UTC)
Seems like this is impossible due to the vast number of transclusions involved. Thegreatgrabber (talk) contribs 23:20, 24 May 2013 (UTC)
@ Thegreatgrabber: are you working on this? Theopolisme ( talk) 18:01, 28 May 2013 (UTC)
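For what it's worth, the Wayback Machine exposes an availability API that makes the lookup side of this request straightforward. A hedged Python sketch follows; the API endpoint is real, but the helper names are mine:

```python
import json
import urllib.parse
import urllib.request

WAYBACK_API = "http://archive.org/wayback/available?url="

def parse_availability(payload):
    """Pull (archive_url, YYYY-MM-DD date) out of an availability-API
    response, or return None when nothing is archived."""
    snap = payload.get("archived_snapshots", {}).get("closest")
    if not snap or not snap.get("available"):
        return None
    ts = snap["timestamp"]  # e.g. "20071031094153"
    return snap["url"], "%s-%s-%s" % (ts[0:4], ts[4:6], ts[6:8])

def find_archive(url):
    """Query the live API (needs network access)."""
    with urllib.request.urlopen(WAYBACK_API + urllib.parse.quote(url, safe="")) as resp:
        return parse_availability(json.load(resp))
```

A bot would call find_archive for each dead-link-tagged URL, fill |archiveurl=/|archivedate= when a snapshot exists, and leave a talk-page note otherwise.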
Currently this list must be manually updated, which is a tedious time-waster, and it seems like the perfect task for a purpose-built little bot. I can't see it being very difficult to create one that measures the data and updates it. Can we have one for the page? — Preceding unsigned comment added by Doc9871 ( talk • contribs)
/* FROM */ 'watchlist', /* SELECT */ 'count(wl_user) AS num',
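The fragment above appears to come from a database-replica query against the MediaWiki watchlist table. A toy, self-contained version of the completed query, using SQLite as a stand-in for the real replica (the table contents here are made up):

```python
import sqlite3

# Toy stand-in for the MediaWiki `watchlist` table; on Wikipedia the real
# query would run against a labs database replica, not SQLite.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE watchlist (wl_user INTEGER, wl_namespace INTEGER, wl_title TEXT)")
db.executemany("INSERT INTO watchlist VALUES (?, ?, ?)",
               [(1, 0, "Main_Page"), (2, 0, "Main_Page"), (3, 0, "Other_Page")])

# The fragment above completed: how many users watch a given page?
(num,) = db.execute(
    "SELECT COUNT(wl_user) AS num FROM watchlist"
    " WHERE wl_namespace = ? AND wl_title = ?", (0, "Main_Page")).fetchone()
print(num)  # → 2
```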
See this discussion. In a nutshell, we at WikiProject Medicine want to add a simple template to the top of all the talk pages that transclude certain infoboxes. It should run periodically to check for new articles; so, obviously, it should only add the template if it doesn't already exist. (Alternatively, it could keep a list of pages that it's already updated in a static store somewhere.)
I haven't written a bot before, but looking at Perl's Mediawiki::Bot modules, I think I could probably write this in a couple of hours. But, it strikes me this must be a very common pattern, and I'm wondering if there isn't a generic bot that can be used, or if there is an example someone could point me to that I could just copy-and-hack.
Thanks! Klortho ( talk) 01:23, 6 June 2013 (UTC)
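Whichever framework ends up being used (Perl's MediaWiki::Bot, pywikibot, etc.), the core of the task is a small idempotent transformation of the talk-page wikitext. A Python sketch of just that decision logic; the banner name here is a placeholder, not the project's actual template:

```python
def ensure_banner(talk_text, banner="{{WPMED talk banner}}"):
    """Prepend `banner` to a talk page's wikitext unless it is already
    present, so repeated runs never add it twice."""
    if banner in talk_text:
        return talk_text  # already tagged; nothing to do
    return banner + "\n" + talk_text
```

The surrounding loop would enumerate pages transcluding each infobox (e.g. via the API's list of embedded pages) and save only when ensure_banner changed the text, which gives the "only add if it doesn't already exist" behaviour without any static store.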
Many project pages list articles with their "class" ratings – such as Vital articles. It would be nice to have a bot keep these up-to-date.
Some points:
|bot=yes. Actually, I'd be willing to create this myself, but I don't know the first thing about making bots... :-( Ypnypn ( talk) 03:41, 5 June 2013 (UTC)
User:B left the following message at WP:VPT, but nobody's responded to it.
Can we get a bot to go through and substitute the "migration" flag of images with the {{ GFDL}} template? Right now, today, if someone uploads a {{ GFDL}} image, it shows the {{ License migration}} template. Now that it is four years after the license migration, I think it makes sense to change the default to be not eligible for migration. But in order to change the default, we would need a bot to do a one-time addition of something like migration=not-reviewed to all existing uses of the GFDL template. So if you have {{ GFDL}} or
{{ self|GFDL}}
, the template would add "migration=not-reviewed" to the template. After that is done, we can make the default "not-eligible" instead of the default being pending review. -- B ( talk) 00:16, 2 June 2013 (UTC)
I thoroughly agree with this idea; the template needs to be modified so that it defaults to migration=not-eligible, and we need to mark existing examples as needing review. This wouldn't be the first bot written to modify tons of license templates; someone's addition of "subject to disclaimers" to {{ GFDL}} (several years ago) forced Commons to maintain two different GFDL templates, and they had to replace many thousands of {{ GFDL}} transclusions. Nyttend ( talk) 03:55, 4 June 2013 (UTC)
Add |migration=not-reviewed to every transclusion of {{ GFDL}}, provided that there was no value set for |migration=? Hazard-SJ ✈ 05:01, 5 June 2013 (UTC)
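As a rough illustration of the one-time edit being discussed (not an approved implementation; a real bot would need proper template parsing to also catch wrappers like {{self|GFDL}}):

```python
import re

def add_migration(wikitext):
    """Append |migration=not-reviewed to bare {{GFDL}} transclusions that
    don't already set |migration=. Naive: ignores nested templates and
    wrappers such as {{self|GFDL}}."""
    def fix(match):
        inner = match.group(1)
        if "migration" in inner:
            return match.group(0)  # parameter already present; leave alone
        return "{{" + inner + "|migration=not-reviewed}}"
    return re.sub(r"\{\{\s*(GFDL[^{}]*?)\s*\}\}", fix, wikitext)
```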
I'm removing superfluous redlinks from country outlines.
To remove the redlinks of "Political scandals of foo" (where foo is a country name) from each country outline, I take a list of country outlines (on a user subpage) and replace "Outline of" with "Political scandals of". That list now shows which links are red.
Then I use AWB's listmaker to make a list of all the redlinks on that page, and I paste them back in to the same page and linkify them. Now there's only redlinks on that page.
Next I change "Political scandals of" back to "Outline of". The list is now all the country outlines that contain the "Political scandals of" redlink in them.
I make a list in AWB from that, and then do a regex search/replace with AWB to get rid of the entry in each of those outlines.
Unfortunately, I have to do this whole procedure over again for each redlink I wish to nuke (so as not to nuke the blue links too), and this approach only works with sets of links that share the "of foo" nomenclature, because AWB has no way to search/replace strings based on redlink status (as far as I know).
If you know of an easier/faster way to do this, please let me know. The Transhumanist 06:18, 19 May 2013 (UTC)
I added a magic entry to {{ Outline City+}}, and it turned invisible. Not a big problem, but once the whole template is converted, it will make it look rather sparse upon first inspection. :) The Transhumanist 07:55, 20 May 2013 (UTC)
Some of the items have section links. Will #ifexist check the existence of sections? The Transhumanist 08:08, 20 May 2013 (UTC)
{{#ifexist:Main page#Nonsense section name|yes|no}} ⇒ yes -- John of Reading ( talk) 06:24, 21 May 2013 (UTC)
Manually stripping redlinks from outlines is a royal pain in the...
See Outline of Guam, for example.
I desperately need help from someone with a bot (or help building one) that can do all of the following:
Remove each bullet list entry that has no offspring and that consists entirely of a redlink (like "Flora of Guam", below),
but only delink (remove the double square brackets from) those that have offspring (like Wildlife of Guam and Fauna of Guam, above). (By "offspring", I mean one or more (non-red) subitems.) So the above structure should end up looking like this:
If a redlink entry has an annotation or descriptive prose after it, it should be delinked rather than deleted. Here are some examples from Outline of Guam:
These should end up looking like this:
Also, "main" article entries that are red must be removed. Ones that look like this:
But, if they have any bluelinks in them, only the redlinks should be removed. So...
...should be made to look like this:
If a section is empty or is made empty due to all of its material being deleted, its heading must be deleted as well. (See Outline of Guam#Local government in Guam. This heading will have no content once its red "main" entry is removed, and so this heading needs to be removed too.)
Many outlines have had redlinks sitting in them for years. They need to be cleaned up. There are so many outlines, this is infeasible to do manually.
Are there any bots that can do this?
If not, how would a script check a link on a page it was processing for whether or not the link was a redlink?
I look forward to your replies. The Transhumanist 00:37, 20 May 2013 (UTC)
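To answer the last question: the MediaWiki API reports nonexistent pages with a "missing" flag, so a script can batch-check link targets in one request (action=query&titles=A|B|C). A sketch of the response handling; the sample response below is constructed, not a live result:

```python
def missing_titles(api_result):
    """Given the parsed JSON of an action=query&titles=... request,
    return the set of titles the wiki flags as "missing" -- i.e. links
    to them would be redlinks."""
    pages = api_result["query"]["pages"]
    return {p["title"] for p in pages.values() if "missing" in p}

sample = {"query": {"pages": {
    "-1": {"title": "Flora of Guam", "missing": ""},
    "241": {"title": "Outline of Guam"},
}}}
print(missing_titles(sample))  # → {'Flora of Guam'}
```

A redlink-stripping bot would collect the link targets on each outline, run them through one such query, and then delete or delink only the entries whose targets come back missing.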
@ The Transhumanist and Theopolisme:According to BAG, this Needs wider discussion..— cyberpower ChatOnline 18:55, 7 June 2013 (UTC)
The request is for an automated bot that scans through Category:Non-free images for NFUR review and attempts to automatically add a preformatted NFUR rationale when one is not is present.
This bot would not apply to all Non-free content and would be limited initially to {{ Non-free album cover}}, {{ Non-free book cover}}, {{ Non-free video cover}} and {{ Non-free logo}} tagged media where the image is used in an Infobox. Essentially this bot would do automatically what I've been doing extensively in a manual fashion with FURME.
In adding the NFUR the bot would (having added a rationale) also add the |image_has rationale=yes param, as well as leaving an appropriate note that the rationale was autogenerated.
By utilising a bot to add the types of rationale concerned automatically, contributor and admin time can be released to deal with more complex FUR claims, which do not have easy pre-formatted rationales or which require a more complex explanation.
Sfan00 IMG ( talk) 14:28, 4 June 2013 (UTC)
Here's my general outline of what it looks like the bot will need to do:
For all files in Category:Non-free images for NFUR review:
  If image meets the following constraints:
  - tagged with {{Non-free album cover}}, {{Non-free book cover}}, {{Non-free video cover}} or {{Non-free logo}}
  - only used in one article
  - file must be the only non-free file in the article
  Then, on the image page:
  - add some fair-use rationales to {{Non-free use rationale}} or {{Non-free use rationale 2}} -- *** I will need rationales to insert ***
  - add "|image has rationale=yes" to {{Non-free album cover}}/{{Non-free book cover}}/{{Non-free video cover}}/{{Non-free logo}}
  - add a new parameter "bot=Theo's Little Bot" to {{Non-free album cover}}/{{Non-free book cover}}/{{Non-free video cover}}/{{Non-free logo}} -- this might need additional discussion as far as implementation/categorization
As you can see, there are still some questions -- #1, are there rationales prewritten that I can use ( Wikipedia:Use_rationale_examples, possibly...)? Secondly, as far as clarifying that it was reviewed by a bot, I think adding |bot= parameters to {{ non-free album cover}} and such would be easy enough, although if you have other suggestions I'm all ears. Theopolisme ( talk) 06:53, 7 June 2013 (UTC)
This is a straight translation (and partial extension) of what FURME does, substituting the {{ Non-free use rationale}} types for the {{ <blah> fur}} types it uses currently. FURME itself needs an overhaul and re-integration into TWINKLE, but that would be outside the scope of a bot request.
This still needs to be tweaked, but in essence it uses the |bot= and |reviewed= style as opposed to modification of the license template. Note this also means it's easier to remove the tag once it's been human reviewed. Sfan00 IMG ( talk) 11:10, 7 June 2013 (UTC)
Hi! I am looking for a bot that could update Wikipedia:Adopt-a-user/Adoptee's Area/Adopters's "currently accepting adoptees" to "not currently available" if they haven't made any edits after a certain period of time. This is because of edits like this, where a new user asks for an adopter but, because the adopter has gone from Wikipedia, never gets a response and just leaves Wikipedia. Thanks, and if you need more clarification just ask! jcc ( tea and biscuits) 17:14, 6 June 2013 (UTC)
Heh, it's me again, with more "archive" bot requests. Here's another simple two: 1: If a |url= part of a reference has the web.archive.org string in it, remove it and strip it back to the proper URL link. 2: If a reference has an |archiveurl tag, but is lacking an |archivedate tag, grab the archiving date from the relevant part of the archive url, e.g. http://web.archive.org/web/20071031094153/http://www.chaptersofdublin.com/books/Wright/wright10.htm would have "20071031" grabbed and formatted to "2007/10/31". [4] shows where I did this sort of thing manually. Lukeno94 (tell Luke off here) 20:49, 6 June 2013 (UTC)
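Request 2 is a simple regular-expression job; a Python sketch using Luke's example URL and his slash-separated date format:

```python
import re

def archive_date(archiveurl):
    """Extract the snapshot date from a Wayback Machine URL and format
    it as YYYY/MM/DD, or return None if the URL doesn't match."""
    m = re.search(r"web\.archive\.org/web/(\d{4})(\d{2})(\d{2})", archiveurl)
    if not m:
        return None
    return "%s/%s/%s" % m.groups()

print(archive_date("http://web.archive.org/web/20071031094153/http://example.com"))
# → 2007/10/31
```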
The Wikipedia:Most missed articles -- often searched for, nonexistent articles -- has not been updated since a batch run in 2008. The German Wikipedia person, Melancholie (de:Benutzer:Melancholie) who did the batch run has not been active since 2009. Where would be a good place to ask for someone with expertise to do another run? It does not seem to fit the requirements of Wikipedia:Village pump (technical) since it is not a technical issue about Wikipedia. It is not a new proposal, and not a new idea. It is not about help using Wikipedia, and it is not a factual WP:Reference Desk question. I didn't find a WikiProject that looked promising. So I am asking for direction here. -- Bejnar ( talk) 19:52, 11 June 2013 (UTC)
Add {{ Hampton, Virginia}} to every page in Category:Neighborhoods in Hampton, Virginia. Emmette Hernandez Coleman ( talk) 23:09, 17 June 2013 (UTC)
This bot's job would be to stick {{ unreliable source}} next to references that are links to websites of sketchy reliability as third-party sources (e.g., blog hosting sites, tabloids, and extremely biased news sites). The list of these sites would be a page in the MediaWiki namespace, like MediaWiki:Spam-blacklist. To allow for false positives (for instance, when the site is being cited as a primary source), the bot would add a hidden category (maybe Category:Unreliable source tags added by a bot) after the template to identify it as being added by the bot, and enable editors to check the tags. The hidden category would be removed by the editor if it was an accurate tagging. If not, the editor would comment out the {{ unreliable source}} tag, which would mean that it would be skipped over by the bot in the future. ❤ Yutsi Talk/ Contributions ( 偉特 ) 14:16, 13 June 2013 (UTC)
{{unreliable source|bot=AwesomeBot}}; however, how you construct your url blacklist is probably most important. For example, examiner.com is on the blacklist because it's just overall not a reliable source. However, there are many links to it, and those have all probably been individually whitelisted [citation needed]. So tagging those wouldn't be useful. Did you have a few example domains in mind? That would help in evaluating your request. Legoktm ( talk) 16:54, 13 July 2013 (UTC)
I'm a lawyer. One feature I really like is the ability to enter a case's legal citation, like "388 U.S. 1" (the legal citation for Loving v. Virginia), as the "title" of a Wikipedia article, and have that entry automatically redirect to the correct page. However, this only exists for Supreme Court cases up to somewhere in the 540th volume of the U.S. reporter, and (as far as I know) not for any other cases.
It would be great if a bot could automatically create redirects between a legal citation and a page about that case. — Preceding unsigned comment added by Jmheller ( talk • contribs) 03:42, 16 June 2013 (UTC)
@ Jmheller: What should be done for cases like [7] where cases are only listed by docket number? Thegreatgrabber (talk) contribs 02:21, 21 June 2013 (UTC)
I just found out that the links to the GeoWhen database are dead. However, through a web search I discovered that the pages are still accessible – just add bak/ after the domain http://www.stratigraphy.org and before the geowhen/ part. Some links have been fixed already, but not all. Note: An additional mirror is found under http://engineering.purdue.edu/Stratigraphy/resources/geowhen/. -- Florian Blaschke ( talk) 16:03, 14 June 2013 (UTC)
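The per-link fix a bot would apply is a plain string substitution, for example:

```python
def fix_geowhen(url):
    """Point a dead GeoWhen link at the still-accessible copy by
    inserting 'bak/' between the domain and the geowhen/ path."""
    return url.replace("http://www.stratigraphy.org/geowhen/",
                       "http://www.stratigraphy.org/bak/geowhen/")
```

Because the replacement target no longer appears in an already-fixed URL, running it twice over the same text is harmless.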
Wikipedia:Categories for discussion/Working/Manual#Templates removed or updated - deletion pending automatic emptying of category has a very large backlog of hidden categories that have been renamed. Due to the way some or all of these categories are generated by the template the job queue alone doesn't seem able to process them and individual articles require null edits to get the updated category. Is it possible for a bot to have a crack at these? Timrollpickering ( talk) 16:10, 21 June 2013 (UTC)
This one should be straightforward:
Et voila. :) Lukeno94 (tell Luke off here) 14:03, 22 June 2013 (UTC)
This is a request for assistance in moving the assessment categories for WikiProject Rational Skepticism. Greg Bard ( talk) 20:02, 23 June 2013 (UTC)
Fixed all categories manually, updating all pages with my bot. -- Magioladitis ( talk) 21:45, 23 June 2013 (UTC)
I deleted all old categories, I fixed/normalised all banners and user wikiproject tags. -- Magioladitis ( talk) 23:25, 23 June 2013 (UTC)
Is there any way that a bot can go through and find instances where an article has the {{ unreferenced}} template and {{ references}}, {{ reflist}}, or any other coding pertaining to references? I've seen a lot of instances of {{ unreferenced}} being used on articles that do have references. This seems like it should be an easy fix. Ten Pound Hammer • ( What did I screw up now?) 03:35, 20 June 2013 (UTC)
<ref>...</ref> tags can contain notes instead of references, so you might want to limit your search to citation templates, such as {{ cite web}} and {{ cite news}}. GoingBatty ( talk) 02:52, 22 June 2013 (UTC)
Here's an example of an article using <ref>...</ref> tags to contain a note: Battle of Breitenfeld (1642). GoingBatty ( talk) 12:22, 22 June 2013 (UTC)
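A first-pass scan for such mistagged articles could be as simple as pairing two pattern checks. This is a sketch; per GoingBatty's advice above, a production bot would look for citation templates specifically rather than bare ref tags:

```python
import re

def wrongly_tagged(wikitext):
    """True when a page carries {{unreferenced}} yet also contains
    apparent references ({{reflist}}, {{references}}, or <ref> tags)."""
    has_unref = re.search(r"\{\{\s*unreferenced\b", wikitext, re.I)
    has_refs = re.search(r"\{\{\s*ref(erence)?list\b|\{\{\s*references\b|<ref[ >]",
                         wikitext, re.I)
    return bool(has_unref and has_refs)
```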
Do like I did here in disambiguation pages per WP:DABSTYLE. -- Magioladitis ( talk) 15:39, 23 June 2013 (UTC)
Hi! Is anyone interested in writing a bot to add articles about Iraqi cities, using census data, to the Sorani Kurdish Wikipedia (CKB)?
I found Iraqi census data at http://cosit.gov.iq/pdf/2011/pop_no_2008.pdf ( Archive) and the idea is something like User:Rambot creating U.S. cities with census data
Thanks WhisperToMe ( talk) 00:11, 26 June 2013 (UTC)
I was wondering if anyone would be able to create a bot that would be able to copy the information about planets detected by the Kepler spacecraft from the Extrasolar Planets Encyclopaedia ( link) to our list of planets discovered using the Kepler spacecraft. Rather than merely going to the list, it would be ideal if the bot could follow the link for each Kepler planet and get the full information from there, rather than merely looking at the catalog. The information in the EPE about Kepler planets is in turn copied from the Kepler discoveries catalog, which is in the public domain but is unfortunately offline at the moment (requiring us to use the copyrighted EPE). In addition to the basic information, I would like it if our bot were able to calculate surface gravity where possible based on mass/(r^2). Thanks, Wer900 • talk 18:08, 27 June 2013 (UTC)
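The surface-gravity calculation mentioned in the request is just Newton's g = GM/r², for example:

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def surface_gravity(mass_kg, radius_m):
    """Surface gravity g = G*M / r^2, in m/s^2."""
    return G * mass_kg / radius_m ** 2

# Sanity check with Earth: should come out near 9.8 m/s^2
print(round(surface_gravity(5.972e24, 6.371e6), 2))
```

A bot would apply this only where the catalog gives both a mass and a radius, converting from Jupiter or Earth units to SI first.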
Hi. I haven't historically been a big editor on Wikipedia. Though I use it from time to time. I realize that there are probably a number of bots at work using various methods to target entries for improvement. However, I just wanted to add my two cents on a method which may or may not be in use.
First, however, some quick background. I am currently taking a Data Science class and for one of my assignments I developed a script which selects a random Wikipedia article and does the following:
1) Counts total words and total sources (word count does not include 'organizational' sections such as References, External Links, etc.)
2) Uses all words contributing to the word count to assess the overall sentiment of the text. For this, I used the AFINN dictionary and the word count to get an average sentiment score per word.
3) For each section and sub-section (h2 and h3) in the page which is not organizational in nature (see above definition) counts the number of words, citations and as with item 2 gets a sentiment score for the section/sub-section
So my thought on using this script is as follows:
If it was used to score a large number of Wikipedia pages, we could come up with some parameters on which a page and its sections and subsections could be scored.
1) For all articles, word count, source count and sentiment score. 2) For all sections and sub-sections, word count, citation count and sentiment score. 3) For pages with sources, a sources per word score 4) For sections with citations, a words per citation score
For all of these parameters, the scores from the sample could be used to determine what sort of statistical distribution they follow. A bot could then scan the full set of wikipedia articles and flag those which are beyond some sort of tolerance limit.
Additionally, data could be collected for sections which commonly occur (Early Life, Private Life, Criticisms etc.) to establish expected distributions for those specific section types. For example, we might expect the sections labeled Criticisms would, on average, have a more negative sentiment than other sections.
I hope this all makes sense and perhaps some or all of it is being done. I look forward to hearing from some more experienced Wikipedians on the idea. Additionally, for sections which com — Preceding unsigned comment added by 64.134.190.157 ( talk) 18:46, 21 June 2013 (UTC)
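The per-section scoring described in steps 2-3 reduces to a dictionary lookup averaged over the word count. A toy sketch with a five-word stand-in for the AFINN list (the real AFINN dictionary maps a few thousand words to scores from -5 to +5; the scores below approximate it):

```python
# Tiny stand-in for the AFINN word list.
AFINN = {"good": 3, "great": 3, "excellent": 3, "bad": -3, "scandal": -3}

def section_score(text):
    """Return (average sentiment per word, word count) for a section."""
    words = text.lower().split()
    if not words:
        return 0.0, 0
    total = sum(AFINN.get(w, 0) for w in words)
    return total / len(words), len(words)

print(section_score("a great and excellent result"))  # → (1.2, 5)
```

Run over a large sample, the resulting scores would give the per-section-type distributions the proposal describes, against which outlier articles could be flagged.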
Put {{ South Alexandria}} on every article listed on it, and put the listed articles in a category called Category:South Alexandria, Virginia. Emmette Hernandez Coleman ( talk) 09:20, 29 June 2013 (UTC)
Any chance someone could write a bot to license tag these quickly?
[6=1&templates_no=db-i3&sortby=uploaddate&ext_image_data=1&file_usage_data=1]
Most seem to be reasonably straightforward Sfan00 IMG ( talk) 16:20, 18 June 2013 (UTC)
etc... There may be some others, but this will start to clear out the 5000 or so I've found on the catscan query notes over on WP:AWB. Sfan00 IMG ( talk) 10:14, 23 June 2013 (UTC)
Back in March there was consensus in Proposals to test out Template:COI editnotice on the Talk page of articles about extant organizations, to see if it increases the use of {{ Request edit}} and reduces COI editing. A bot was approved in April to apply the template to Category:Companies based in Idaho. The bot request said it would affect 1,000+ articles, which would be enough of a sample to test, but it looks like it was only applied to about 40 articles? I am unsure if the bot was never run on the entire category, or if we need a larger category. The original bot-runner is now retired. Any help would be appreciated. CorporateM ( Talk) 14:12, 30 June 2013 (UTC)
I know that previously PhotoCatBot has tagged articles fitting this criteria, but is there any way that we could get a new bot to help finish up where this bot left off almost three years ago? Thanks! Kevin Rutherford ( talk) 02:16, 1 July 2013 (UTC)
Make sure all articles listed on {{ Arlington County, Virginia}} have the template, and are in Category:Neighborhoods in Arlington County, Virginia.
Create redirects to these articles in the format "X, Virginia" and "X, Arlington, Virginia" and "X". For example, all of the following should redirect to Columbia Forest Historic District: Columbia Forest, Virginia; Columbia Forest, Arlington, Virginia; and Columbia Forest; Columbia Forest Historic District, Virginia and Columbia Forest Historic District, Arlington, Virginia. Emmette Hernandez Coleman ( talk) 10:41, 3 July 2013 (UTC)
Howdy, I haven't had the need for a bot before, but I'm organizing a meetup and would like help in posting invites to the folks on this list. I can come up with a short message, is that all I need to provide? Or is there anymore info needed? Thanks, Olegkagan ( talk) 00:44, 27 June 2013 (UTC)
Okay, and it would be nice to have the message to be added in case a trial is requested of me, thanks. Hazard-SJ ✈ 02:19, 3 July 2013 (UTC)
Please could somebody "Subst:" all 84 article-space transclusions of the German-language {{ Infobox Unternehmen}}, which is now a wrapper for the English-language {{ Infobox company}}, as in this edit? That may be a job for AWB, which unfortunately I can't use on my small-screen netbook. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:08, 10 July 2013 (UTC)
Removes all articles in {{ Peconic County, New York}} that are NOT in the following categories: Category:Sag Harbor, New York, Category:Riverhead (town), New York, Category:Shelter Island (town), New York, Category:Southampton (town), New York, Category:Southold, New York. Emmette Hernandez Coleman ( talk) 04:13, 12 July 2013 (UTC)
Or alternatively, remove all articles that ARE in the following categories: Category:Babylon (town), New York, Category:Brookhaven, New York, Category:Huntington, New York, Category:Islip (town), New York, Category:Smithtown, New York. Emmette Hernandez Coleman ( talk) 04:21, 12 July 2013 (UTC)
Never-mind. The template might be deleted, so no point in putting that effort into it until we know it will be kept. Emmette Hernandez Coleman ( talk) 09:08, 13 July 2013 (UTC)
Given NoomBot has been down since April, maybe another bot could be made to make/update the ever-useful Book Reports. igordebraga ≠ 01:31, 13 July 2013 (UTC)
Add {{ Orleans Parish, Louisiana}} to every page it lists. Emmette Hernandez Coleman ( talk) 22:33, 13 July 2013 (UTC)
Also {{ Neighborhoods of Denver}}. At the rate I'm going I'll probably create a few more navboxes in the next few days, so it would be easier to do a bunch of navboxes together; don't bother with either of these yet. Emmette Hernandez Coleman ( talk) 08:20, 14 July 2013 (UTC)
I've just added an hCalendar microformat to {{ Infobox poem}}, so a bot is now required to apply {{ Start date}} to the |publication_date= parameter.
The logic should be:
This is related to a larger request with clear community consensus, which has not yet been actioned; I'm hoping that this smaller, more manageable task, will attract a response, which can later be applied elsewhere. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:12, 14 July 2013 (UTC)
Is df=y especially important? At the moment, I haven't implemented it, since according to my tests it would slow down the script a fair bit. Theopolisme ( talk) 17:09, 14 July 2013 (UTC)
I would like to suggest a bot that disables autoblock on softblocks - if an admin accidentally enables autoblock on a softblock, my suggested bot will automatically DISable autoblock and enable account creation, e-mail, and talk page access if any, some or all of them are disabled. The bot will then lift the autoblock on the IP address affected by the soft-blocked user. 76.226.117.87 ( talk) 03:09, 15 July 2013 (UTC)
I already said that I wanted a bot to change block settings (see the declined request made by 76.226.117.87), but now I have fixed it:
The process goes like this:
1. An administrator ENables autoblock when blocking an account that should have autoblock DISabled.
2. The bot I'm suggesting will change the block settings for the affected account so that account creation, e-mail, and talk page access are all allowed, if any of them are disabled. The bot will also disable autoblock on the affected account, and lift the autoblock on the IP address recently used by the blocked account.
3. The resulting log entry should look something like this (timestamps will vary, and some extra comments are added that are not normally present in block log entries):
I thought of a bot that corrected simple spelling errors, such as beacuse-because and teh-the. buff bills 7701 22:17, 15 July 2013 (UTC)
Context: Citing references is important for medical articles (under WP:MED or related WikiProjects), though it is important for other articles as well. Just as books have ISBNs, almost all renowned medical journals are listed on Pubmed and their articles bear a PMID. Pubmed serves as a global quality-regulating and indexing body for medical articles, and if a medical article does not have a PMID, chances are that the journal is not a popular one and therefore there is a possibility that it does not adhere to the expected quality standards. Other medical articles have a Digital object identifier or DOI (with or without a PMID), which serves to provide a permanent link redirecting to the article's present url. Some Pubmed articles are freely accessible and have a PMC (alongside the PMID), which is therefore an optional parameter. Thus, if a <ref></ref> has neither PMID nor DOI, chances are that 1. the article has a PMID (most cases) but the <ref></ref> tag lacks its mention, or 2. the article lacks a PMID or DOI and it's not a problem of the <ref></ref> placed.
I feel there is a requirement for two different bots.
Utility: These bots would enable the editors of medical articles to make the mentioned references more reliable and verifiable and would encourage users to use this template while placing references. Diptanshu Talk 16:15, 10 July 2013 (UTC)
The folks at the Village Pump suggested that I post this request here.
I have been correcting errors in citations, and I have noticed that pretty much every Russia-related article I edit contains an incorrectly formatted parameter, "language=ru", in its citation parameters. (The "language" parameter takes the full language name, not the two-letter code.)
You can see an example of one of these citations here. Note that the rendered reference says "(in ru)" instead of the correct "(in Russian)".
It occurred to me that someone clever with a script or bot or similar creature might be able to do a semi-automated find and replace of "language=ru" in citations with "language=Russian".
Is this a good place to request such a thing? I do not have the time or skills to take on such a project myself. Thanks. Jonesey95 ( talk) 14:06, 16 July 2013 (UTC)
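A sketch of the semi-automated replacement, assuming a small hand-made code-to-name table -- a real run would need the full ISO 639-1 list and human review of each edit:

```python
import re

# Minimal sketch with a hand-made table; a production bot would load
# the full ISO 639-1 code list instead of these four entries.
LANG_NAMES = {"ru": "Russian", "de": "German", "fr": "French", "ja": "Japanese"}

def fix_language_params(wikitext):
    # Only touch |language= values that are a bare two-letter code,
    # leaving full names like |language=Russian alone.
    def repl(m):
        return m.group(1) + LANG_NAMES.get(m.group(2).lower(), m.group(2))
    return re.sub(r"(\|\s*language\s*=\s*)([A-Za-z]{2})(?=\s*[|}])", repl, wikitext)
```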
In the pump thread, using Lua to automatically parse the language codes was suggested -- I think that would definitely be preferable if possible, rather than having a bot make a boatload of fairly trivial edits. Theopolisme ( talk) 14:27, 16 July 2013 (UTC)
Wikipedia is too important and too useful a resource to have citations behind paywalls if there is another possible reference. In order to draw attention to references that need improving/substitution, it would be nice if there was a bot that would tag articles that are behind a paywall. I realize that some newspapers slowly roll articles behind a paywall as time passes. However, other newspapers have all their content behind a paywall. A good example is The Sunday Times. You can click on any link on http://www.thesundaytimes.co.uk and you will be presented with "your preview of the sunday times." For wikipedians like myself who enjoy contributing by verifying citations, it is frustrating that we can't verify Sunday Times citations for free. When I see a paywall-tagged citation I often try to find another citation and substitute it. A bot would be helpful for this. DouglasCalvert ( talk) 23:51, 16 July 2013 (UTC)
WMF has turned Visual Editor on for IP accounts now, and the results are as expected: Filter 550 shows that a significant volume of articles are getting mutilated with stray wikitext. It has been proposed to set the filter to block the edits that include nowikis that indicate that the combination of VE and an unknowing editor has caused a problem.
I'd personally rather see this sort of mess cleaned up by a bot, and a message left on a user talk page that asks the editor either to stop using wiki markup or to stop using VE. I think it's possible to detect strings generated by VE (basically, it surrounds wikitext with nowiki tags), and figure out what the fix is (in some [most?] cases, just remove the nowiki tags), similar to how User:BracketBot figures out that an edit has broken syntax. Given that the problem is on the order of magnitude of 300 erroneous edits per day, is it possible to move with all deliberate pace to field such a bot?
(Background: see WP:VPR#Filter 550 should disallow.) -- John Broughton (♫♫) 03:31, 16 July 2013 (UTC)
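A rough sketch of the detection described above, assuming the common case where VE has wrapped ordinary wikitext (links, templates, bold/italic quotes, headings) in nowiki tags and the fix is simply to drop the tags -- real VE damage is messier, so a bot would want a human-reviewed set of patterns like this:

```python
import re

# Find wikitext markup wrapped in nowiki tags (links, templates,
# bold/italic quotes, headings) and remove just the tags, keeping
# the markup itself. Plain-text nowikis are left alone.
VE_NOWIKI = re.compile(r"<nowiki>((?:\[\[|\{\{|'{2,3}|={2,})[^<]*)</nowiki>")

def strip_ve_nowikis(text):
    """Return (fixed_text, number_of_fixes) for logging/reporting."""
    return VE_NOWIKI.subn(r"\1", text)
```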
WPCleaner can now detect <nowiki>...</nowiki> tags in main namespace articles and suggest a fix for each of them -- basically just removing the tags, except for specific cases. I've only one in mind for now: for a nowiki at the beginning of a line with whitespace characters after it, the whitespace characters should be removed too. To activate this detection, edit Special:MyPage/WikiCleanerConfiguration and add the following contents (with the <source>...</source> tags):
# Configuration for error 518: nowiki tags
error_518_bot_enwiki=true END
When <nowiki>...</nowiki> tags are found, suggestions are given to fix them. It's quite basic, so if you think of any enhancement, tell me. -- NicoV ( Talk on frwiki) 22:52, 17 July 2013 (UTC)
I came here from the WP:Village Pump (proposals) page, and I suggest instead of an autofix bot, maybe a bot much like User:DPL bot? They could notify everyone who accidentally triggered the filter and each person could go back and fix it. Unless that would create lots of spam? Just a thought. kikichugirl inquire 22:21, 20 July 2013 (UTC)
It seems to me that VE is leading to a rise in external links [9] in article text, as refs. Can an autobot move the link to refs or EL with an edit note for a human to follow-up? Thanks. Alanscottwalker ( talk) 10:48, 23 July 2013 (UTC)
DASHBot ( talk · contribs) used to create, and/or periodically update, a list of unreferenced biographies of living persons for a given Wikiproject (see User:DASHBot/Wikiprojects). However, that bot has been blocked since March. I'm wondering if another one can accomplish this same task. I'm asking on behalf of WP:JAZZ (whose list is at Wikipedia:WikiProject Jazz/Unreferenced BLPs) but there were a lot of other WikiProjects on that list, as well (I'd already removed WP:JAZZ, though). -- Gyrofrog (talk) 21:55, 23 July 2013 (UTC)
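The core of the task is just a category intersection; a sketch, with the fetching of the two page sets (e.g. via the API's list=categorymembers) left out and the inputs treated as plain iterables of titles:

```python
# Sketch of the core of the task: intersect the pages in
# Category:All unreferenced BLPs with a project's tagged articles.
# Fetching the two title sets from the API is left out here.
def project_unreferenced_blps(unref_blps, project_articles):
    return sorted(set(unref_blps) & set(project_articles))
```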
I have noticed that some links contain Google Analytics tracking parameters. It seems that Wikipedia links should not help companies track user clicks. I have even noticed that some advert-like entries about companies contain links with custom GA campaign parameters to target clicks from Wikipedia to the company page/blog/etc. Removing these GA tracking parameters seems like a great task for a bot. All Google Analytics parameters begin with utm. The basic parameters are:
Does this sound doable?
DouglasCalvert ( talk) 03:56, 13 July 2013 (UTC)
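A sketch of the stripping itself with Python's standard urllib.parse -- parameter names beginning with utm_ are removed and everything else in the URL is kept:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_utm(url):
    """Drop Google Analytics utm_* query parameters, keep the rest."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if not k.lower().startswith("utm_")]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```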
utm_ parameter, correct? Theopolisme ( talk) 21:56, 13 July 2013 (UTC)
I've written the basic program, and it successfully made this edit (incorporating the canonical url) -- if the canonical url isn't available, it'll just use a regular expression to strip the utm_ parameters. Douglas, is this what you were looking for? Theopolisme ( talk) 01:33, 14 July 2013 (UTC)
@ Hazard-SJ: could I steal the source code that you used to generate that list of urls (I assume it involved a db replica somewhere or other, they're all basically the same *wink*)? That way I won't have to manually crawl every...single...page... Theopolisme ( talk) 01:36, 14 July 2013 (UTC)
Could the same task be useful in trimming cruft from Google Books links, like this edit? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:30, 17 July 2013 (UTC)
Anyone? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:12, 21 July 2013 (UTC)
page, for example, definitely needs to stay -- are there others? I wouldn't want to remove something that would alter the appearance of the page for the person who clicks on the link (besides, say, removing extraneous search terms and such). Theopolisme ( talk) 00:35, 22 July 2013 (UTC)
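A sketch with an assumed whitelist -- id and pg are the two parameters I'd guess affect what the reader sees, but the list would need review before any bot ran for real:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed whitelist: id identifies the book and pg the page, both of
# which change what the reader sees. Anything else (search terms,
# interface language, etc.) is dropped. Review before real use.
KEEP = {"id", "pg"}

def trim_google_books(url):
    parts = urlsplit(url)
    if not parts.netloc.startswith("books.google."):
        return url  # leave non-Google-Books links untouched
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k in KEEP]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```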
It seems like it would be of great use to create a bot that finds and lists the most vandalised pages on the wiki, and create a list article or essay that regularly updates the list, in order to alert all wikipedians to which pages require the most monitoring and reverting. Superwikiwalrus ( talk) 14:12, 27 July 2013 (UTC)
Hi! This is my first bot request so bear with me. The bot I am requesting will do/perform the following functions:
Does anyone here have any thoughts on the feasibility of this? ★★ KING RETROLORD★★ 09:10, 22 July 2013 (UTC)
One more request: could the bot check that there is at a minimum one citation per paragraph? Thanks, ★★ KING RETROLORD★★ 09:59, 22 July 2013 (UTC)
(==|Category:|\[\[File:|\[\[Image:|{{Authority control|{{.*}}|^<.*?>|^;|^\|). It's still not completely foolproof, though, and still gets some false positives. Theopolisme ( talk) 04:10, 23 July 2013 (UTC)
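A sketch of the one-citation-per-paragraph check, reusing the exclusion pattern quoted above to skip non-prose lines (the {{sfn}} check is my own addition; it would still share the false positives mentioned):

```python
import re

# The exclusion pattern quoted above, with braces escaped for re.
SKIP = re.compile(
    r"(==|Category:|\[\[File:|\[\[Image:|\{\{Authority control|\{\{.*\}\}|^<.*?>|^;|^\|)")

def uncited_paragraphs(wikitext):
    """List prose paragraphs that contain no <ref> or {{sfn}} citation."""
    flagged = []
    for para in re.split(r"\n{2,}", wikitext):
        para = para.strip()
        if not para or SKIP.search(para):
            continue  # blank, or matches the non-prose exclusions
        if "<ref" not in para and "{{sfn" not in para.lower():
            flagged.append(para)
    return flagged
```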
Yes, when running "for real" the bot will report on all current nominations. As far as where the reports end up... I would like to just use sections on User:Theo's Little Bot/GAN, since that prevents having to create a ton of new pages—unless you have a reason why multiple pages would be beneficial. Theopolisme ( talk) 05:09, 23 July 2013 (UTC)
{{ User:Theo's Little Bot/GAN/link}} can be used to automatically link to a specific article's listing. Theopolisme ( talk) 06:15, 23 July 2013 (UTC)
@ GoingBatty: I've implemented basic spell checking using Wikipedia:Lists of common misspellings/For machines ( commit). I initially tried using a larger corpus (courtesy of NLTK+Project Gutenberg), but it was taking way too long to process each article (5-8 minutes), so I settled for Wikipedia:Lists of common misspellings/For machines instead. It's not as complete, but should still catch "common misspellings." ;) Your thoughts? Is this adequate? Theopolisme ( talk) 19:17, 23 July 2013 (UTC)
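A sketch of the lookup, assuming the one-pair-per-line "misspelling->correction" format of Wikipedia:Lists of common misspellings/For machines:

```python
import re

# Assumes the list's one-pair-per-line "misspelling->correction"
# format, e.g. "beacuse->because".
def load_corrections(listing):
    table = {}
    for line in listing.splitlines():
        if "->" in line:
            wrong, right = line.split("->", 1)
            table[wrong.strip()] = right.strip()
    return table

def correct(text, table):
    # Replace whole words only, so "teh" inside "Tehran" is untouched.
    return re.sub(r"[A-Za-z]+", lambda m: table.get(m.group(0), m.group(0)), text)
```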
@ Theopolisme: Can this bot be used to scan current Good Articles? The bot might be able to select some articles that might no longer meet the GA criteria for examination by human users. If a user decides an article no longer meets the criteria, he can open a GAR.-- FutureTrillionaire ( talk) 01:35, 25 July 2013 (UTC)
Looks like the bot is done. However, there are issues. I checked about 20 of the articles listed, and it appears that the vast majority were selected because they have at least one dead link tag. However, this is not very useful, because dead links do not violate any of the GA criteria. I saw only one article that contained an orange tag, and a few articles containing only citation needed tags or disambiguation needed tags. Is it possible for the bot to ignore dead link tags and other less serious tags? I was hoping to just see articles with orange tags displayed at the top, or something like that.-- FutureTrillionaire ( talk) 01:50, 27 July 2013 (UTC)
{{ Ambox}} (that's "the orange" you were talking about). Thoughts? Thanks for bearing with me on this. (Another note: for some reason, the bot listed articles from least->most tags...fixed.) Theopolisme ( talk) 02:33, 27 July 2013 (UTC)
{{ Current}} and {{ Split}}.-- FutureTrillionaire ( talk) 02:59, 27 July 2013 (UTC)
{{ Current}} and {{ Split}} wouldn't be included, since they aren't also in Category:Cleanup templates. Here's a page to enter templates for the whitelist, though, should you stumble upon anything. Theopolisme ( talk) 03:39, 27 July 2013 (UTC)