This page is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
There seems to be some problems with the notification of users named on DRN cases:
Wikipedia talk:Dispute resolution noticeboard#Autonotification
Wikipedia talk:Dispute resolution noticeboard#Autonotification
-- Guy Macon ( talk) 01:18, 14 November 2012 (UTC)
I thought that you deserved something a bit extra for all of the amazing work you've done for the project.
I've nominated you for a gift from the Wikimedia Foundation!
Legoktm ( talk) 20:54, 14 November 2012 (UTC)
Your bot User:EarwigBot is doing a fine job in maintaining Template:AFC statistics, but there are a small number of entries there that shouldn't IMO be there, but that the bot adds nevertheless. I am not sure if the bot code needs tweaking, or if anything needs to be done about these pages instead. The pages I mean are the ones I manually removed here. This includes pages like Inch Parish, Wigtownshire, a redirect where neither the source nor the target seem to have any AfC categories. Can you take a look? It's not urgent, and there is no reason to stop this bot task obviously, but the lists are very long and removing a few that have no place there would make it a bit lighter. Fram ( talk) 10:20, 13 November 2012 (UTC)
Template:AFC statistics is no longer working: it exceeds limits and now gives only one of the four sections. Fram ( talk) 16:16, 22 November 2012 (UTC)
...from Category:Undated AfC submissions? — Wylie Coyote 13:21, 28 November 2012 (UTC)
Articles for creation is desperately short of reviewers! We are looking for urgent help, from experienced editors, in reviewing submissions in the pending submissions queue. Currently there are 3329 submissions waiting to be reviewed and many help requests at our help desk.
If the answer to these questions is yes, then please read the reviewing instructions and donate a little of your time to helping tackle the backlog. You might wish to add {{AFC status}} or {{AfC Defcon}} to your userpage, which will alert you to the number of open submissions.
Plus, reviewing is easy when you use our new semi-automated reviewing script!
Hi. Thank you for your recent edits. Wikipedia appreciates your help. We noticed though that when you edited Digite, Inc., you added a link pointing to the disambiguation page Mountain View ( check to confirm | fix with Dab solver). Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot ( talk) 10:55, 30 November 2012 (UTC)
The Earwig: I was recently creating a page on Center for Hispanic Leadership and it was deleted today due to ambiguous advertising/promotion. I realize now that I had inappropriate information in the entry and completely agree with the deletion. That being said, I would like to create the page again listing only information about the Center for Hispanic Leadership (CHL), the Hispanic Training Center and the CHL Chapters. I plan to refer to the CHL Founder, Glenn Llopis, who has a page on Wikipedia already, just once as a point of reference and then list reputable articles and references at the end of the page.
Again, I wanted to reach out and acknowledge my misstep and would like to be afforded the opportunity to create a CHL page that is in-line with Wikipedia's guidelines.
Please let me know your feedback.
Kind regards, Marisa Salcines — Preceding unsigned comment added by Gabri7elle ( talk • contribs) 04:25, 3 December 2012 (UTC)
Hi, I have created a new entry for Center for Hispanic Leadership. Please let me know if it meets the standards.
Thanks 06:21, 4 December 2012 (UTC) — Preceding unsigned comment added by Gabri7elle ( talk • contribs)
Apparently you reviewed Template:Derry and passed it. I feel that you have called your competence as a reviewer into question due to your failure to properly flag the blatant bias in the template whilst going on to add it to several articles. The overt nationalist/republican bias of the template is plain to see straight from the off, and until it is fixed it should not be added to articles. Mabuska (talk) 14:41, 4 December 2012 (UTC)
Could you please check if your copyvio detector is still working? I am getting the following error message:
Error message:

Error! SiteNotFoundError: Site 'all' not found in the sitesdb.
<%include file="/support/header.mako" args="environ=environ, cookies=cookies, title='Copyvio Detector', add_css=('copyvios.css',), add_js=('copyvios.js',)"/>\
<%namespace module="toolserver.copyvios" import="main, highlight_delta"/>\
<%namespace module="toolserver.misc" import="urlstrip"/>\
<% query, bot, all_langs, all_projects, page, result = main(environ) %>\
% if query.project and query.lang and query.title and not page:
The given site (project=${query.project | h}, language=${query.lang | h}) doesn't seem to exist. It may also be closed or private. <a href="https://${query.lang | h}.${query.project | h}.org/">Confirm its URL.</a>
/home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 159: raise SiteNotFoundError(error)
/home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 186: namespaces) = self._load_site_from_sitesdb(name)
/home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 135: site = self._make_site_object(name)
/home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 340: return self._get_site_object(name)
/home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 112: site = self._sitesdb.get_site(sitename)
/home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 149: self._update(sitename)
/home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 154: self.sync("all")
/home/earwig/git/earwigbot/earwigbot/wiki/copyvios/__init__.py, line 146: self._exclusions_db.sync(self.site.name)
./toolserver/copyvios/checker.py, line 28: result = page.copyvio_check(max_queries=10, max_time=45)
./toolserver/copyvios/__init__.py, line 23: page, result = get_results(bot, site, query)
pages/copyvios.mako, line 4: <% query, bot, all_langs, all_projects, page, result = main(environ) %>\
/home/earwig/.local/solaris/lib/python2.7/site-packages/Mako-0.7.2-py2.7.egg/mako/runtime.py, line 817: callable_(context, *args, **kwargs)
By the way, thank you for creating the tool! It's very useful – I have been using it when reviewing AfC submissions. The Anonymouse ( talk • contribs) 16:42, 5 December 2012 (UTC)
Why did u decline my article?? — Preceding unsigned comment added by 66.87.97.103 ( talk) 01:11, 12 December 2012 (UTC)
The WikiProject Articles for creation newsletter
Hi, so at WT:DRN, there is a discussion on making subpages for each case (sort of a WP:SPI style). Would you modify the bot if the proposal succeeds (it is very likely to succeed)? If so, how long would it take?
Copied from Steven Zhang to explain SPI style:
Essentially, how this works is that instead of each dispute being a thread on the one page, each dispute would have its own page that is created by the filer, very similar to the format of WP:SPI. When a dispute is closed (as resolved or otherwise), it's archived, and can be easily referred back to if a dispute is filed again. Potential positives with the change are a more organised format and making it easier to look back on past discussions. Negatives include the loss of all cases being easily viewable on a watchlist, [ ... ], and criticism of increased bureaucracy.
~~ Ebe 123~~ → report 14:43, 28 December 2012 (UTC)
I have found a mirror while using EarwigBot's copyvio detector - wpedia.goo.ne.jp. Can you add it to the ignore list please? (great tool BTW!) Mdann52 ( talk) 14:00, 17 January 2013 (UTC)
So, I think there's something wrong with the copyvio detector. I ran [1] and got "Jatin Seth is a suspected violation of en.wikipedia.org/wiki/Jatin_Seth." I'm pretty sure that's not supposed to happen. — Darkwind ( talk) 03:24, 19 January 2013 (UTC)
See Wikipedia talk:WikiProject Articles for creation/2014 5#Template limit means backlog list doesn't show up in /Submissions for discussion. davidwr/( talk)/( contribs)/( e-mail) 22:37, 21 January 2013 (UTC)
Hi. I know I have a global account, but your tool says otherwise. Also, if I fail to enter a username (which it claims is not required), it crashes completely. Just checking if you were aware of these issues (or if the tool is supposed to work at all!) Mdann52 ( talk) 13:21, 29 January 2013 (UTC)
I've noticed the bot hasn't updated the AFC statistics page in two days. Was wondering if this was done on purpose or it's just that it can't handle the backlog. Funny Pika! 03:11, 9 February 2013 (UTC)
Hi. I just wanted to drop you a quick note to say that I read mwparserfromhell's README the other day and found it to be some of the best documentation I'd ever read. Very nicely done. :-) -- MZMcBride ( talk) 04:31, 13 February 2013 (UTC)
Hello! In case you were unaware, the Template:AFC statistics/sandbox hasn't been updating for some time. The Template:AFC statistics seems broken; since they are connected, maybe it's all part of the same problem. Thanks for creating these lists; I find them very helpful. — Anne Delong ( talk) 13:18, 18 February 2013 (UTC)
I've just removed an extra template:DRN archive top from a closed thread at WP:DRN. This is the second one that's turned up recently. After a little look through the page history, I discovered the culprit: EarwigBot! ( [2] and [3].) It seems to only do it when the 'do not archive until' comment isn't removed, so it can be prevented by always remembering to remove the comment when closing, but the bot clearly isn't working properly as no bottom template is added and even if it was, the template is being added to already collapsed threads, making it pointless. CarrieVS ( talk) 15:04, 21 February 2013 (UTC)
Hi Earwig:
I am asking what I (actually my student) need to do to improve Thomas E. Emerson's Wikipedia page, which was declined recently. I see the request for independent references is one thing. http://en.wikipedia.org/wiki/Wikipedia_talk:Articles_for_creation/Thomas_E._Emerson
Dr. Emerson is one of the most famous archaeologists doing Eastern North American prehistory, the author of numerous well received books, edited volumes, and papers. I am more than willing to edit the page and organize it better, and am wondering how to proceed to meet your concerns about independent references. In our field we have book reviews that are published in academic journals, and some of those could be cited. The books themselves that he wrote are published, and could also be referenced.
FYI, I am a professor of archaeology at the University of Tennessee, and I have my students write articles for Wikipedia on a famous archaeologist, archaeological site, or archaeological project every time I teach an advanced undergraduate class in North American archaeology, provided there is no previously published article on the subject on Wikipedia.
I am not as proficient at Wikipedia as I could be, but believe it is a critically important reference tool, which is why I support it and try to build its intellectual content. My students have posted >100 articles over the past 5 years. I am always puzzled why some articles go up with little or no comment, while others have more problems. I appreciate what you all do, and just want to learn to do it better.
Feel free to email me back if you want... dander19@utk.edu
Thanks! 160.36.65.208 ( talk) 21:16, 24 February 2013 (UTC) David
David G. Anderson, Ph.D., RPA Professor and Associate Head Department of Anthropology The University of Tennessee 250 South Stadium Hall Knoxville, Tennessee 37996-0720 dander19@utk.edu http://web.utk.edu/~anthrop/faculty/anderson.html http://pidba.tennessee.edu/ http://bellsbend.pidba.org/
WikiProject AFC is holding a one month long Backlog Elimination Drive!
The goal of this drive is to eliminate the backlog of unreviewed articles. The drive is running from March 1st, 2013 – March 31st, 2013.
Awards will be given out for all reviewers participating in the drive in the form of barnstars at the end of the drive.
There is a backlog of over 2000 articles, so start reviewing articles! Visit the drive's page and help out!
Delivered by User:EdwardsBot on behalf of Wikiproject Articles for Creation at 13:54, 27 February 2013 (UTC)
March 2, 2013
Hello Earwig,
I am writing to you because you have given me pointers about how to improve an article about the conductor and producer Peter Tiboris. I took your advice, reworked the article, and resubmitted it about three weeks ago. The article has been rejected again, by an admittedly inexperienced editor, due to lack of evidence of the subject's notability. This is the exact same rejection form used previously. Before resubmitting the article, I added numerous citations of articles and reviews that appeared in The New York Times, The New Yorker, and other well-known and verifiable sources.
When I click on the link to edit the rejected article, I am taken to an article about a rap singer. I don't quite understand this connection. Mr. Tiboris is well known in the music industry as a classical music conductor and a producer of concerts. He has conducted orchestras in 20 countries and produced 1200 concerts throughout the world, primarily in New York's Carnegie Hall, over a 30-year period.
I don't know what to do next, which is why I am writing to you.
Many thanks for your suggestions about Article: Peter Tiboris.
Sincerely,
Dale Zeidman Dzeidman ( talk) 01:03, 3 March 2013 (UTC)
Hello, you seem to be responsible for mwparserfromhell development, so I hope you don't mind me asking: I posted a question at WP:BON regarding parsing out ref tags with mwparserfromhell v. 0.1.1. It doesn't seem to be working for me; should I be expecting it to? Appreciate your input, cheers...
Zad68 15:49, 13 March 2013 (UTC)
Zad68 00:15, 14 March 2013 (UTC)
Zad68 01:02, 14 March 2013 (UTC)
{{foo|bar}} gets converted into the tokens [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateClose()], which then get converted into a Template object with the data stored within it. The series of objects that make up the wikicode are then wrapped in a Wikicode object, which has methods like filter_templates(). The entire process is a lot more complex than just regex, because regex is prone to catastrophic failures when the input is not exactly what it expects, whereas a tokenizer can handle confusing cases, like nested templates and wikicode that looks like a template but actually isn't because there's an invalid character in the template's name. The regex necessary for that to work properly would be far too complex, and probably impossible. — Earwig talk 01:38, 14 March 2013 (UTC)
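The token stream Earwig describes can be sketched with a toy tokenizer — a much-simplified illustration of the idea, not mwparserfromhell's actual code, and it only handles a single flat template:

```python
def toy_tokenize(text):
    """Emit (token_name, payload) pairs for a flat {{name|param}} template.
    Real mwparserfromhell also handles nesting, invalid names, etc."""
    tokens, buf, i = [], "", 0
    while i < len(text):
        two = text[i:i + 2]
        if two == "{{" or two == "}}" or text[i] == "|":
            if buf:  # flush any accumulated text before a delimiter
                tokens.append(("Text", buf))
                buf = ""
            if two == "{{":
                tokens.append(("TemplateOpen", None))
                i += 2
            elif two == "}}":
                tokens.append(("TemplateClose", None))
                i += 2
            else:
                tokens.append(("TemplateParamSeparator", None))
                i += 1
        else:
            buf += text[i]
            i += 1
    if buf:
        tokens.append(("Text", buf))
    return tokens
```

Running toy_tokenize("{{foo|bar}}") yields the same five-token sequence shown above, which a builder would then assemble into a Template node.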
Zad68 02:37, 14 March 2013 (UTC)
So after installing git and the Python-dev libraries (needed for Python.h because it looks like it's doing some C compiling), and doing a little reading, I got as far as getting the dev mwparserfromhell lib downloaded, installed locally and built, and it looks like I'm using it, but processing behavior is the same as before; the ref tags aren't parsed out:
Extended content:

$ git clone -b feature/html_tags git://github.com/earwig/mwparserfromhell.git
...
$ python setup.py install --user
running install
running bdist_egg
running egg_info
...
Adding mwparserfromhell 0.2.dev to easy-install.pth file
Installed /home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg
Processing dependencies for mwparserfromhell==0.2.dev
Finished processing dependencies for mwparserfromhell==0.2.dev
$ python
Python 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mwparserfromhell
>>> mwparserfromhell.__version__
u'0.2.dev'
>>> mwparserfromhell.__file__
'/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/__init__.pyc'
>>> text = "I has a template!<ref>{{foo|bar|baz|eggs=spam}}</ref> See it?"
>>> wikicode = mwparserfromhell.parse(text)
>>> wikicode.filter_templates()
[u'{{foo|bar|baz|eggs=spam}}']
>>> wikicode.filter_tags()
[]
>>> wikicode.filter_text()
[u'I has a template!<ref>', u'</ref> See it?']
Is there some development flag or something I have to enable or something to get it to parse the tags? Hope you don't mind me stinking up your Talk page with this, if you'd rather do this somewhere else let me know. Also my email is enabled, that's good for me too. Any help/direction appreciated, cheers...
Zad
68
03:27, 14 March 2013 (UTC)
Maybe this has something to do with where you said "You would also need to explicitly use the Python tokenizer instead of the C extension", but I'm not sure how to do that.
Zad68 03:34, 14 March 2013 (UTC)
>>> import mwparserfromhell
>>> from mwparserfromhell.parser.tokenizer import Tokenizer
>>> from mwparserfromhell.parser.builder import Builder
>>> text = "I has a template!<ref>{{foo|bar|baz|eggs=spam}}</ref> See it?"
>>> wikicode = Builder().build(Tokenizer().tokenize(text))
>>> wikicode.filter_templates(recursive=True) # 'recursive' needed because template is nested inside tag
[u'{{foo|bar|baz|eggs=spam}}']
>>> wikicode.filter_tags()
[u'<ref>{{foo|bar|baz|eggs=spam}}</ref>']
>>> tag = wikicode.filter_tags()[0]
>>> tag.tag
u'ref'
>>> tag.type == tag.TAG_REF
True
>>> tag.contents
u'{{foo|bar|baz|eggs=spam}}'
>>> tag.contents.filter_templates()[0].name
u'foo'
>>> tag.contents.filter_templates()[0].params
[u'bar', u'baz', u'eggs=spam']
Hope that helps. Oh, and my talk page is fine for this sort of discussion. — Earwig talk 03:59, 14 March 2013 (UTC)
Zad68 12:56, 14 March 2013 (UTC)
(after git pull-ing the repository):
>>> import mwparserfromhell
>>> mwparserfromhell.parser.use_c = False
>>> wikicode = mwparserfromhell.parse(text)
Hey Earwig, try testing this:
wikicode = Builder().build(Tokenizer().tokenize('<ref name="a-b">'))
Error I get is in my local tokenizer.py, line 472, in _actually_close_tag_opening:
if isinstance(self._stack[-1], tokens.TagAttrStart):
IndexError: list index out of range
Only seems to occur when: 1) it's a ref tag, 2) the name parameter is specified and has a value with certain characters in it, like - (hyphen) or = (equals), and 3) the name parameter value is in double quotes. Bug?
Zad68 14:10, 14 March 2013 (UTC)
Self-closing tags don't seem to be handled properly:
Extended content:

$ python
Python 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mwparserfromhell
>>> from mwparserfromhell.parser.tokenizer import Tokenizer
>>> from mwparserfromhell.parser.builder import Builder

# Without self-closing ref tag, works
>>> wikicode = Builder().build(Tokenizer().tokenize('I has a template!<ref name=foo>{{bar}}</ref>'))
>>> wikicode.filter_tags()
[u'<ref name=foo>{{bar}}</ref>']
>>> wikicode.filter_tags(recursive=True)
[u'<ref name=foo>{{bar}}</ref>']

# With self-closing tag, doesn't work
>>> wikicode = Builder().build(Tokenizer().tokenize('I has a template!<ref name=foo>{{bar}}</ref><ref name=baz/>'))
>>> wikicode.filter_tags()
[]
>>> wikicode.filter_text()
[u'baz']
>>> wikicode.filter_tags(recursive=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 376, in filter_tags
    return list(self.ifilter_tags(recursive, matches, flags))
  File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 301, in ifilter
    for node in nodes:
  File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 82, in _get_all_nodes
    for child in self._get_children(node):
  File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 59, in _get_children
    for context, child in node.__iternodes__(self._get_all_nodes):
AttributeError: 'NoneType' object has no attribute '__iternodes__'

# Edge case with self-closing tag only:
>>> wikicode = Builder().build(Tokenizer().tokenize('<ref name=foo/>'))
>>> wikicode.filter_tags()
[]
>>> wikicode.filter_text()
[u'foo']

# If the tag isn't "ref", different but still incorrect behavior:
# it doesn't stack trace but doesn't work either...
>>> wikicode = Builder().build(Tokenizer().tokenize('I has<bloop name=baz/> a template!'))
>>> wikicode.filter_tags()
[]
>>> wikicode.filter_tags(recursive=True)
[]
>>>
Any questions let me know...
Zad
68
16:08, 14 March 2013 (UTC)
wikicode = Builder().build(Tokenizer().tokenize("==Epidemiology==\nFoo.<ref>hi<br />there</ref>"))
# this looks OK:
>>> wikicode.filter_tags()
[u'<ref>hi<br />there</ref>']
# but doing it recursively yields slightly different stack trace
>>> wikicode.filter_tags(recursive=True)
Traceback (most recent call last):
...
AttributeError: 'NoneType' object has no attribute 'nodes'
Check out:
>>> text = 'I has a template!\nfoo\n==bar==\n===baz===\nend'
>>> wikicode = Builder().build(Tokenizer().tokenize(text))
>>> wikicode.get_sections()
[u'I has a template!\nfoo\n', u'==bar==\n===baz===\nend', u'===baz===\nend']
Is that what I should be expecting? Cheers...
Zad68 17:00, 14 March 2013 (UTC)
filter() with the forcetype parameter, like...

>>> text = 'I has a template!\nfoo\n==bar==\n===baz===\nend'
>>> wikicode = mwparserfromhell.parse(text)
>>> wikicode.filter(forcetype=mwparserfromhell.nodes.Heading)
[u'==bar==', u'===baz===']
The AFC statistics template the bot makes is so big that it is no longer usable.
In addition to the main template your bot generates, can you have the bot generate a version that is broken into parts, each no bigger than a few hundred entries? My personal preference would be one part for each section except "pending", breaking "pending" up by day or week, with one part per day or week. To serve those who like to work on the biggest or smallest submissions, a separate page showing the 100 biggest and 100 smallest submissions would be useful.
The idea is that when the backlog is small, we can use the whole template, but when it is large, we can use the various parts. davidwr/( talk)/( contribs)/( e-mail) 02:47, 27 February 2013 (UTC)
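The splitting proposed here is essentially chunking one long entry list into fixed-size parts; a trivial sketch (the part size and the subpage naming are made up for illustration, not the bot's actual behavior):

```python
def split_into_parts(entries, max_size=300):
    """Break the full list of submission entries into parts of at most
    max_size entries each; each part would then be rendered to its own
    subpage (e.g. a hypothetical Template:AFC statistics/part1)."""
    return [entries[i:i + max_size]
            for i in range(0, len(entries), max_size)]
```

With a small backlog everything fits in one part (the whole template); with a large one, readers load only the parts they need.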
Hi Earwig, I've converted some of the AFC statistics data to Lua templates, and it seems to work. A few caveats:
On the plus side, it does render the entire page, without the brokenness the current template displays. Might this be some way forward? You can find my testcase on User:Martijn_Hoekstra/templatetest (which might, as I said, take some time to load). (note to lurkers, this was created manually, and will NOT be updated as new articles are added/reviewed) Martijn Hoekstra ( talk) 11:10, 18 March 2013 (UTC)
Hello Earwig, just wanted to show you what my (first) goal is in using the mwparserfromhell libraries. The intent of the bot I'm building is to assist me, and any other Wikipedia editor who finds it useful, in getting a jump-start on doing GA reviews, and especially GA reviews of medical articles. As an example of what it'd look like, I ran my bot on Alzheimer's disease (after some massaging to work around the few bugs mentioned above), and the output looks like this. It pulls the Level-2 and Level-3 section headings because I like to make GA review notes section-by-section as I go through the article. It also uses the ref-tag processing to pull all the refs in the article into a Sources table for review. (I like to actually go through every source, verify it's WP:RS, and do a lot of checking that the source is used properly.) As an additional helper, it uses the template processing to identify all the [v]cite journal templates, pulls the PMID for each one, and then goes to PubMed to pull the article's type and put it in the table - for medical articles we really insist on secondary sources like review articles and meta-analyses. The bot even handles the case where a single ref is bundled and has multiple journal templates with PMIDs. Just wanted to share, maybe solicit suggestions, and... well, it'd be great to get the issues fixed; when they are, I'm celebrating the pony way. Appreciate all you're doing with the libraries...
Zad68 03:51, 19 March 2013 (UTC)
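The ref-pulling step described above can be approximated with a short regex sketch while the parser's tag support is being debugged — a stand-in for filter_tags(), not the bot's actual code, and it deliberately skips self-closing <ref name=x/> reuses:

```python
import re

# Matches an opening <ref ...> (not self-closing) through its </ref>.
REF = re.compile(r"<ref[^>/]*>(.*?)</ref>", re.DOTALL)

def extract_refs(wikitext):
    """Return the contents of every <ref>...</ref> pair, e.g. the
    citation templates one would feed into a sources table."""
    return REF.findall(wikitext)
```

A proper parser is still preferable (this sketch mishandles quoted slashes and nested tags), but it is enough to collect the citation templates for a sources table.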
I am writing because EarwigBot has not run task 2 (Articles for Creation statistics page/dashboard) in 3 days (last run: 14:22, 18 March 2013). AfC is currently in a backlog elimination drive, and having to sort through already-reviewed submissions that are not coming off the list is exceptionally annoying. If you could do a manual run of task 2 and verify that it successfully completes, that would be greatly appreciated. Hasteur ( talk) 13:30, 21 March 2013 (UTC)
Good morning!
I need to post a page and am concerned that a page I am working on is about to be deleted. Any suggestions? Avewiki ( talk) 13:39, 27 March 2013 (UTC)
Articles for Creation urgently needs YOUR help!
Articles for Creation is desperately short of reviewers! We are looking for urgent help, from experienced editors, in reviewing submissions in the pending submissions queue. Currently there are 3329 submissions waiting to be reviewed and many help requests at our Help Desk.
If the answer to these questions is yes, then please read the reviewing instructions and donate a little of your time to helping tackle the backlog. You might wish to add {{ AFC status}} or {{ AfC Defcon}} to your userpage, which will alert you to the number of open submissions.
We would greatly appreciate your help. Currently, only a small handful of users are reviewing articles. Any help, even if it's just 2 or 3 reviews, would be extremely beneficial.
(comment to make MiszaBot archive this) — Earwig talk 03:18, 4 April 2013 (UTC)
Hi Earwig--is it possible to rename a template using mwparserfromhell? I thought one could just use template.name('newname'), but apparently that's not the case. — Theopolisme ( talk) 11:19, 29 April 2013 (UTC)
.name is not a function, it's an attribute, so you set it instead of calling it. Try template.name = 'newname'. — Earwig talk 21:38, 29 April 2013 (UTC)
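The distinction here — name being a plain attribute rather than a setter method — can be shown with a stripped-down stand-in class (hypothetical, not mwparserfromhell's real Template):

```python
class ToyTemplate:
    """Stand-in exposing the same attribute-style interface: read
    template.name, and rename by assigning to it."""
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return "{{%s}}" % self.name

t = ToyTemplate("foo")
t.name = "newname"   # plain assignment: this is the correct way
renamed = str(t)     # now renders as {{newname}}
```

Calling t.name("other") instead would raise TypeError, since the attribute holds a string, not a callable — which is exactly the confusion above.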
Hi! Could you look at my edits to Template:DRN case status and how EarwigBot undoes them? Am I doing something wrong?
BTW, my first attempt was to change the title on the main DRN page. In the past the bot has picked up on the change and updated the template, but that isn't happening either. -- Guy Macon ( talk) 06:19, 2 May 2013 (UTC)
In three cases that I identified as heavy copyvios by hand/eye (now hidden from the reader, but still present in the source), the copyvio detector reports between 40 and 50% confidence, and therefore claims "No violations detected" in a happy green box. [4] [5] [6] (see especially the details). I suggest using a yellow box with "there are hints of copying" for scores as high as those. Also, calculating confidence at the sentence or paragraph level would be helpful to get a more distinct score. Would it be a problem if the tool were used automatically to scan pages for copyvios (sequentially, of course)? (The response times let me guess that it might be resource-intensive.) -- Mopskatze ( talk) 17:26, 4 May 2013 (UTC)
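The suggestion amounts to replacing the binary verdict with a three-tier one; a sketch of the idea (the 0.40 and 0.75 cut-offs are invented for illustration, not the tool's real thresholds):

```python
def verdict(confidence):
    """Map a copyvio confidence score in [0, 1] to a display tier."""
    if confidence >= 0.75:
        return "red: likely violation"
    if confidence >= 0.40:
        return "yellow: hints of copying, review by hand"
    return "green: no violation detected"
```

Under a scheme like this, the 40-50% scores in the reports above would land in a yellow "review by hand" tier instead of the green box.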
Hi Earwig, remember this? One more character to add to the list of problematic characters in parsing "ref name=...": Ampersand.
Any progress here? I'm still very interested in using mwparserfromhell to make a ref-processing bot, if there's anything I can do to help you debug or test please let me know.
Zad68 14:09, 3 May 2013 (UTC)
One more to look at: Parsing <div class="references-small"> fails as well, same reason I guess - because the parameter value has a hyphen in it, so that issue is not just limited to ref tag names. Thanks...
Zad68 20:41, 6 May 2013 (UTC)
This page is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page. |
There seems to be some problems with the notification of users named on DRN cases:
Wikipedia talk:Dispute resolution noticeboard#Autonotification
Wikipedia talk:Dispute resolution noticeboard#Autonotification
-- Guy Macon ( talk) 01:18, 14 November 2012 (UTC)
I thought that you deserved something a bit extra for all of the amazing work you've done for the project.
I've nominated you for a gift from the Wikimedia Foundation! |
Legoktm ( talk) 20:54, 14 November 2012 (UTC)
Your bot User:EarwigBot is doing a fine job in maintaining Template:AFC statistics, but there are a small number of entries there that shouldn't IMO be there, but that the bot adds nevertheless. I am not sure of the bot code needs tweaking, or if anything needs to be done about these pages instead. The pages I mean are the ones I manually removed here. This includse pages like Inch Parish, Wigtownshire, a redirect where neither the source nor the target seem to have any AfC categories. Can you take a look? It's not urgent, and there is no reason to stop this bot task obviously, but the lists are very long and removing a few that have no place there would make it a bit lighter. Fram ( talk) 10:20, 13 November 2012 (UTC)
Template:AFC statistics is no longer working; it exceeds limits and now shows only one of the four sections. Fram ( talk) 16:16, 22 November 2012 (UTC)
...from Category:Undated AfC submissions? — Wylie Coyote 13:21, 28 November 2012 (UTC)
Articles for creation is desperately short of reviewers! We are looking for urgent help, from experienced editors, in reviewing submissions in the pending submissions queue. Currently there are 3329 submissions waiting to be reviewed and many help requests at our help desk.
If the answer to these questions is yes, then please read the
reviewing instructions and donate a little of your time to helping tackle the backlog. You might wish to add {{
AFC status}} or {{
AfC Defcon}} to your userpage, which will alert you to the number of open submissions.
Plus, reviewing is easy when you use our new semi-automated
reviewing script!
Hi. Thank you for your recent edits. Wikipedia appreciates your help. We noticed though that when you edited Digite, Inc., you added a link pointing to the disambiguation page Mountain View ( check to confirm | fix with Dab solver). Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot ( talk) 10:55, 30 November 2012 (UTC)
The Earwig: I was recently creating a page on Center for Hispanic Leadership and it was deleted today due to unambiguous advertising/promotion. I realize now that I had inappropriate information in the entry and completely agree with the deletion. That being said, I would like to create the page again listing only information about the Center for Hispanic Leadership (CHL), the Hispanic Training Center and the CHL Chapters. I plan to refer to the CHL Founder, Glenn Llopis, who has a page on Wikipedia already, just once as a point of reference and then list reputable articles and references at the end of the page.
Again, I wanted to reach out and acknowledge my misstep and would like to be afforded the opportunity to create a CHL page that is in-line with Wikipedia's guidelines.
Please let me know your feedback.
Kind regards, Marisa Salcines — Preceding unsigned comment added by Gabri7elle ( talk • contribs) 04:25, 3 December 2012 (UTC)
Hi, I have created a new entry for Center for Hispanic Leadership. Please let me know if it meets the standards.
Thanks 06:21, 4 December 2012 (UTC) — Preceding unsigned comment added by Gabri7elle ( talk • contribs)
Apparently you reviewed Template:Derry and passed it. I feel that you have called your competence as a reviewer into question due to your failure to properly flag the blatant bias in the template whilst going on to add it to several articles. The overt nationalist/republican bias of the template is plain to see straight from the off and until it is fixed it should not be added to articles. Mabuska (talk) 14:41, 4 December 2012 (UTC)
Could you please check if your copyvio detector is still working? I am getting the following error message:
Error message
Error ! SiteNotFoundError: Site 'all' not found in the sitesdb. <%include file="/support/header.mako" args="environ=environ, cookies=cookies, title='Copyvio Detector', add_css=('copyvios.css',), add_js=('copyvios.js',)"/>\ <%namespace module="toolserver.copyvios" import="main, highlight_delta"/>\ <%namespace module="toolserver.misc" import="urlstrip"/>\ <% query, bot, all_langs, all_projects, page, result = main(environ) %>\ % if query.project and query.lang and query.title and not page:The given site (project=${query.project | h}, language=${query.lang | h}) doesn't seem to exist. It may also be closed or private. <a href="https://${query.lang | h}.${query.project | h}.org/">Confirm its URL.</a> /home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 159: raise SiteNotFoundError(error) /home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 186: namespaces) = self._load_site_from_sitesdb(name) /home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 135: site = self._make_site_object(name) /home/earwig/git/earwigbot/earwigbot/wiki/sitesdb.py, line 340: return self._get_site_object(name) /home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 112: site = self._sitesdb.get_site(sitename) /home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 149: self._update(sitename) /home/earwig/git/earwigbot/earwigbot/wiki/copyvios/exclusions.py, line 154: self.sync("all") /home/earwig/git/earwigbot/earwigbot/wiki/copyvios/__init__.py, line 146: self._exclusions_db.sync(self.site.name) ./toolserver/copyvios/checker.py, line 28: result = page.copyvio_check(max_queries=10, max_time=45) ./toolserver/copyvios/__init__.py, line 23: page, result = get_results(bot, site, query) pages/copyvios.mako, line 4: <% query, bot, all_langs, all_projects, page, result = main(environ) %>\ /home/earwig/.local/solaris/lib/python2.7/site-packages/Mako-0.7.2-py2.7.egg/mako/runtime.py, line 817: callable_(context, *args, **kwargs) |
By the way, thank you for creating the tool! It's very useful – I have been using it when reviewing AfC submissions. The Anonymouse ( talk • contribs) 16:42, 5 December 2012 (UTC)
Why did u decline my article?? — Preceding unsigned comment added by 66.87.97.103 ( talk) 01:11, 12 December 2012 (UTC)
The WikiProject Articles for creation newsletter
Hi, so at WT:DRN, there is a discussion on making subpages for each case (sort of a WP:SPI style). Would you modify the bot if the proposal succeeds (very likely to succeed)? If so, how long would it take?
Copied from Steven Zhang to explain SPI style:
Essentially, how this works is that instead of each dispute being a thread on the one page, each dispute would have its own page, created by the filer, very similar to the format of WP:SPI. When a dispute is closed (as resolved or otherwise), it's archived, and can be easily referred back to if a dispute is filed again. Potential positives of the change are a more organised format and easier reference to past discussions. Negatives include the loss of all cases being easily viewable on a watchlist, [ ... ], and criticism of increased bureaucracy.
~~ Ebe 123~~ → report 14:43, 28 December 2012 (UTC)
I have found a mirror while using EarwigBot's copyvio detector: wpedia.goo.ne.jp. Can you add it to the ignore list please? (great tool BTW!) Mdann52 ( talk) 14:00, 17 January 2013 (UTC)
So, I think there's something wrong with the copyvio detector. I ran [1] and got "Jatin Seth is a suspected violation of en.wikipedia.org/wiki/Jatin_Seth." I'm pretty sure that's not supposed to happen. — Darkwind ( talk) 03:24, 19 January 2013 (UTC)
See Wikipedia talk:WikiProject Articles for creation/2014 5#Template limit means backlog list doesn't show up in /Submissions for discussion. davidwr/( talk)/( contribs)/( e-mail) 22:37, 21 January 2013 (UTC)
Hi. I know I have a global account, but your tool says otherwise. Also, if I fail to enter a username (which it claims is not required), it crashes completely. Just checking if you were aware of these issues (or if the tool is supposed to work at all!) Mdann52 ( talk) 13:21, 29 January 2013 (UTC)
I've noticed the bot hasn't updated the AFC statistics page in two days. Was wondering if this was done on purpose or it's just that it can't handle the backlog. Funny Pika! 03:11, 9 February 2013 (UTC)
Hi. I just wanted to drop you a quick note to say that I read mwparserfromhell's README the other day and found it to be some of the best documentation I'd ever read. Very nicely done. :-) -- MZMcBride ( talk) 04:31, 13 February 2013 (UTC)
Hello! In case you were unaware, the Template:AFC statistics/sandbox hasn't been updating for some time. The Template:AFC statistics seems broken; since they are connected, maybe it's all part of the same problem. Thanks for creating these lists; I find them very helpful. — Anne Delong ( talk) 13:18, 18 February 2013 (UTC)
I've just removed an extra template:DRN archive top from a closed thread at WP:DRN. This is the second one that's turned up recently. After a little look through the page history, I discovered the culprit: EarwigBot! ( [2] and [3].) It seems to only do it when the 'do not archive until' comment isn't removed, so it can be prevented by always remembering to remove the comment when closing, but the bot clearly isn't working properly as no bottom template is added and even if it was, the template is being added to already collapsed threads, making it pointless. CarrieVS ( talk) 15:04, 21 February 2013 (UTC)
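A bot-side sanity check for this failure mode could be as simple as comparing template counts before saving. A hypothetical sketch (the template names follow this thread; this is not EarwigBot's actual code):

```python
# Hypothetical guard: a thread is only safe to save if every
# {{DRN archive top}} has a matching {{DRN archive bottom}}.
def archive_templates_balanced(text):
    return text.count("{{DRN archive top") == text.count("{{DRN archive bottom")

print(archive_templates_balanced("{{DRN archive top}}...{{DRN archive bottom}}"))  # True
print(archive_templates_balanced("{{DRN archive top}}...no bottom"))  # False
```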
Hi Earwig:
I am asking what I (actually my student) needs to do to improve Thomas E. Emerson's Wikipedia page that was declined recently. I see the request for independent references is one thing. http://en.wikipedia.org/wiki/Wikipedia_talk:Articles_for_creation/Thomas_E._Emerson
Dr. Emerson is one of the most famous archaeologists doing Eastern North American prehistory, the author of numerous well received books, edited volumes, and papers. I am more than willing to edit the page and organize it better, and am wondering how to proceed to meet your concerns about independent references. In our field we have book reviews that are published in academic journals, and some of those could be cited. The books themselves that he wrote are published, and could also be referenced.
FYI, I am a professor of archaeology at the University of Tennessee, and I have my students write articles for Wikipedia on a famous archaeologist, archaeological site, or archaeological project every time I teach an advanced undergraduate class in North American archaeology, provided there is no previously published article on the subject on Wikipedia.
I am not as proficient at Wikipedia as I could be, but believe it is a critically important reference tool, which is why I support it and try to build its intellectual content. My students have posted >100 articles over the past 5 years. I am always puzzled why some articles go up with little or no comment, while others have more problems. I appreciate what you all do, and just want to learn to do it better.
Feel free to email me back if you want... dander19@utk.edu
Thanks! 160.36.65.208 ( talk) 21:16, 24 February 2013 (UTC) David
David G. Anderson, Ph.D., RPA Professor and Associate Head Department of Anthropology The University of Tennessee 250 South Stadium Hall Knoxville, Tennessee 37996-0720 dander19@utk.edu http://web.utk.edu/~anthrop/faculty/anderson.html http://pidba.tennessee.edu/ http://bellsbend.pidba.org/
WikiProject AFC is holding a one month long Backlog Elimination Drive!
The goal of this drive is to eliminate the backlog of unreviewed articles. The drive is running from March 1st, 2013 – March 31st, 2013.
Awards will be given out for all reviewers participating in the drive in the form of barnstars at the end of the drive.
There is a backlog of over 2000 articles, so start reviewing articles! Visit the
drive's page and help out!
Delivered by User:EdwardsBot on behalf of Wikiproject Articles for Creation at 13:54, 27 February 2013 (UTC)
March 2, 2013
Hello Earwig,
I am writing to you because you have given me pointers about how to improve an article about the conductor and producer Peter Tiboris. I took your advice, reworked the article, and resubmitted it about three weeks ago. The article has been rejected again, by an admittedly inexperienced editor, due to lack of evidence of the subject's notability. This is the exact same rejection form used previously. Before resubmitting the article, I added numerous citations of articles and reviews that appeared in The New York Times, The New Yorker, and other well-known and verifiable sources.
When I click on the link to edit the rejected article, I am taken to an article about a rap singer. I don't quite understand this connection. Mr. Tiboris is well known in the music industry as a classical music conductor and a producer of concerts. He has conducted orchestras in 20 countries and produced 1200 concerts throughout the world, primarily in New York's Carnegie Hall, over a 30-year period.
I don't know what to do next which is why I am writing to you.
Many thanks for your suggestions about Article: Peter Tiboris.
Sincerely,
Dale Zeidman Dzeidman ( talk) 01:03, 3 March 2013 (UTC)
Hello, you seem to be responsible for mwparserfromhell development, so hope you don't mind me asking: I posted a question at
WP:BON regarding parsing out ref tags with mwparserfromhell v. 0.1.1. Doesn't seem to be working for me, should I be expecting it to? Appreciate your input, cheers...
Zad
68
15:49, 13 March 2013 (UTC)
Zad
68
00:15, 14 March 2013 (UTC)
Zad
68
01:02, 14 March 2013 (UTC)
{{foo|bar}} gets converted into the tokens [TemplateOpen(), Text(text="foo"), TemplateParamSeparator(), Text(text="bar"), TemplateClose()], which then get converted into a Template object with the data stored within it. The series of objects that make up the wikicode is then wrapped in a Wikicode object, which has methods like filter_templates(). The entire process is a lot more complex than plain regex, because regex is prone to catastrophic failures when the input is not exactly what it expects, whereas a tokenizer can handle confusing cases, like nested templates and wikicode that looks like a template but actually isn't because there's an invalid character in the template's name. The regex necessary for that to work properly would be far too complex, and probably impossible. —
Earwig
talk 01:38, 14 March 2013 (UTC)
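The stack idea behind the tokenizer can be shown with a minimal, stdlib-only toy (this is not mwparserfromhell's actual tokenizer, just an illustration of why a depth-tracking scanner copes with nested templates where a single regex cannot):

```python
# Toy scanner: record "{{" positions on a stack; each "}}" closes the most
# recently opened one, so nested templates fall out naturally. Unclosed
# "{{" left on the stack are treated as plain text, mirroring how a
# tokenizer recovers from wikicode that only looks like a template.
def find_templates(text):
    spans, stack = [], []
    i = 0
    while i < len(text) - 1:
        if text[i:i + 2] == "{{":
            stack.append(i)
            i += 2
        elif text[i:i + 2] == "}}" and stack:
            spans.append(text[stack.pop():i + 2])
            i += 2
        else:
            i += 1
    return spans

print(find_templates("{{foo|{{bar}}}}"))  # ['{{bar}}', '{{foo|{{bar}}}}']
```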
Zad
68
02:37, 14 March 2013 (UTC)
So after installing git, the Python-dev libraries (needed for Python.h because it looks like it's doing some C compiling), and doing a little reading, I got as far as getting the dev mwparserfromhell lib downloaded, installed locally and built, and it looks like I'm using it, but processing behavior is same as before, the ref tags aren't parsed out:
Extended content
$ git clone -b feature/html_tags git://github.com/earwig/mwparserfromhell.git ... $ python setup.py install --user running install running bdist_egg running egg_info ... Adding mwparserfromhell 0.2.dev to easy-install.pth file Installed /home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg Processing dependencies for mwparserfromhell==0.2.dev Finished processing dependencies for mwparserfromhell==0.2.dev $ python Python 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import mwparserfromhell >>> mwparserfromhell.__version__ u'0.2.dev' >>> mwparserfromhell.__file__ '/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/__init__.pyc' >>> text = "I has a template!<ref>{{foo|bar|baz|eggs=spam}}</ref> See it?" >>> wikicode = mwparserfromhell.parse(text) >>> wikicode.filter_templates() [u'{{foo|bar|baz|eggs=spam}}'] >>> wikicode.filter_tags() [] >>> wikicode.filter_text() [u'I has a template!<ref>', u'</ref> See it?'] |
Is there some development flag or something I have to enable or something to get it to parse the tags? Hope you don't mind me stinking up your Talk page with this, if you'd rather do this somewhere else let me know. Also my email is enabled, that's good for me too. Any help/direction appreciated, cheers...
Zad
68
03:27, 14 March 2013 (UTC)
Maybe this has something to do with where you said "You would also need to explicitly use the Python tokenizer instead of the C extension", but I'm not sure how to do that.
Zad
68
03:34, 14 March 2013 (UTC)
>>> import mwparserfromhell
>>> from mwparserfromhell.parser.tokenizer import Tokenizer
>>> from mwparserfromhell.parser.builder import Builder
>>> text = "I has a template!<ref>{{foo|bar|baz|eggs=spam}}</ref> See it?"
>>> wikicode = Builder().build(Tokenizer().tokenize(text))
>>> wikicode.filter_templates(recursive=True) # 'recursive' needed because template is nested inside tag
[u'{{foo|bar|baz|eggs=spam}}']
>>> wikicode.filter_tags()
[u'<ref>{{foo|bar|baz|eggs=spam}}</ref>']
>>> tag = wikicode.filter_tags()[0]
>>> tag.tag
u'ref'
>>> tag.type == tag.TAG_REF
True
>>> tag.contents
u'{{foo|bar|baz|eggs=spam}}'
>>> tag.contents.filter_templates()[0].name
u'foo'
>>> tag.contents.filter_templates()[0].params
[u'bar', u'baz', u'eggs=spam']
Hope that helps. Oh, and my talk page is fine for this sort of discussion. — Earwig talk 03:59, 14 March 2013 (UTC)
Zad
68
12:56, 14 March 2013 (UTC)
(after git pull-ing the repository):
>>> import mwparserfromhell
>>> mwparserfromhell.parser.use_c = False
>>> wikicode = mwparserfromhell.parse(text)
Hey Earwig, try testing this:
wikicode = Builder().build(Tokenizer().tokenize('<ref name="a-b">'))
Error I get is: in my local tokenizer.py, line 472, in _actually_close_tag_opening:
if isinstance(self._stack[-1], tokens.TagAttrStart): IndexError: list index out of range
Only seems to occur when: 1) It's a ref tag, 2) name parameter is specified and has a value with certain characters in it, like - (hyphen) or = (equals), 3) the name parameter value is in double-quote. Bug?
Zad
68
14:10, 14 March 2013 (UTC)
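For comparison, Python's stdlib html.parser handles quoted attribute values containing hyphens without trouble; a quick check (HTMLParser is tag-agnostic, so the non-HTML <ref> tag is fine here):

```python
from html.parser import HTMLParser

# Capture the attributes of a <ref ...> start tag; the quoted value "a-b"
# with a hyphen parses cleanly.
class RefParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.attrs = None

    def handle_starttag(self, tag, attrs):
        if tag == "ref":
            self.attrs = dict(attrs)

parser = RefParser()
parser.feed('<ref name="a-b">')
print(parser.attrs)  # {'name': 'a-b'}
```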
Self-closing tags don't seem to be handled properly:
Extended content
$ python Python 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import mwparserfromhell from mwparserfromhell.parser.tokenizer import Tokenizer from mwparserfromhell.parser.builder import Builder >>> >>> >>> # Without self-closing ref tag, works >>> wikicode = Builder().build(Tokenizer().tokenize('I has a template!<ref name=foo>{{bar}}</ref>')) >>> wikicode.filter_tags() [u'<ref name=foo>{{bar}}</ref>'] >>> wikicode.filter_tags(recursive=True) [u'<ref name=foo>{{bar}}</ref>'] # With self-closing tag, doesn't work >>> wikicode = Builder().build(Tokenizer().tokenize('I has a template!<ref name=foo>{{bar}}</ref><ref name=baz/>')) >>> wikicode.filter_tags() [] >>> wikicode.filter_text() [u'baz'] >>> wikicode.filter_tags(recursive=True) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 376, in filter_tags return list(self.ifilter_tags(recursive, matches, flags)) File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 301, in ifilter for node in nodes: File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 82, in _get_all_nodes for child in self._get_children(node): File "/home/zad68/.local/lib/python2.7/site-packages/mwparserfromhell-0.2.dev-py2.7-linux-x86_64.egg/mwparserfromhell/wikicode.py", line 59, in _get_children for context, child in node.__iternodes__(self._get_all_nodes): AttributeError: 'NoneType' object has no attribute '__iternodes__' # Edge case with self-closing tag only: >>> wikicode = Builder().build(Tokenizer().tokenize('<ref name=foo/>')) >>> wikicode.filter_tags() [] >>> wikicode.filter_text() [u'foo'] # If the tag isn't "ref", different but still 
incorrect behavior: # it doesn't stack trace but doesn't work either... >>> wikicode = Builder().build(Tokenizer().tokenize('I has<bloop name=baz/> a template!')) >>> wikicode.filter_tags() [] >>> wikicode.filter_tags(recursive=True) [] >>> |
Any questions let me know...
Zad
68
16:08, 14 March 2013 (UTC)
wikicode = Builder().build(Tokenizer().tokenize("==Epidemiology==\nFoo.<ref>hi<br />there</ref>")) # this looks OK: >>> wikicode.filter_tags() [u'<ref>hi<br />there</ref>'] # but doing it recursively yields slightly different stack trace >>> wikicode.filter_tags(recursive=True) Traceback (most recent call last): ... AttributeError: 'NoneType' object has no attribute 'nodes'
Check out:
>>> text = 'I has a template!\nfoo\n==bar==\n===baz===\nend' >>> wikicode = Builder().build(Tokenizer().tokenize(text)) >>> wikicode.get_sections() [u'I has a template!\nfoo\n', u'==bar==\n===baz===\nend', u'===baz===\nend']
Is that what I should be expecting? Cheers...
Zad
68
17:00, 14 March 2013 (UTC)
Try filter() with the forcetype parameter, like:
>>> text = 'I has a template!\nfoo\n==bar==\n===baz===\nend'
>>> wikicode = mwparserfromhell.parse(text)
>>> wikicode.filter(forcetype=mwparserfromhell.nodes.Heading)
[u'==bar==', u'===baz===']
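A line-anchored regex gives a rough, stdlib-only approximation of the same heading extraction (a toy sketch only; the node-based filter above is the robust way):

```python
import re

# Toy heading extractor: match whole lines framed by 2-6 '=' signs,
# using re.M so ^ and $ anchor at line boundaries.
def find_headings(text):
    return re.findall(r"^(={2,6}.*?={2,6})\s*$", text, flags=re.M)

print(find_headings("I has a template!\nfoo\n==bar==\n===baz===\nend"))
# ['==bar==', '===baz===']
```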
The AFC statistics template the bot makes is so big that it is no longer usable.
In addition to the main template your bot generates, can you have the bot generate a version that is broken into parts, each no bigger than a few hundred entries? My personal preference would be one part for each section except "pending", and to break "pending" up by day or week, with one part per day or week. For those who like to work on the biggest or smallest submissions, a separate page showing the 100 biggest and 100 smallest submissions would be useful.
The idea is that when the backlog is small, we can use the whole template, but when it is large, we can use the various parts. davidwr/( talk)/( contribs)/( e-mail) 02:47, 27 February 2013 (UTC)
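One way a bot could produce such parts is plain fixed-size chunking, a hypothetical sketch (the chunk size of 300 is an arbitrary illustration value, not anything the bot actually uses):

```python
# Hypothetical sketch: split the statistics entries into subpage-sized
# chunks, one chunk per generated part.
def chunk(entries, size=300):
    return [entries[i:i + size] for i in range(0, len(entries), size)]

parts = chunk(list(range(1000)))
print([len(p) for p in parts])  # [300, 300, 300, 100]
```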
Hi Earwig, I've converted some of the AFC statistics data to Lua templates, and it seems to work. A few caveats:
On the plus side, it does render the entire page, without the brokenness the current template displays. Might this be some way forward? You can find my testcase on User:Martijn_Hoekstra/templatetest (which might, as I said, take some time to load). (note to lurkers, this was created manually, and will NOT be updated as new articles are added/reviewed) Martijn Hoekstra ( talk) 11:10, 18 March 2013 (UTC)
Hello Earwig, just wanted to show you what my (first) goal is in using the mwparserfromhell libraries. The intent of the bot I'm building is to assist me, and any other Wikipedia editor who finds it useful, in getting a jump-start on doing GA reviews, and especially GA reviews of medical articles. As an example of what it'd look like, I ran my bot on
Alzheimer's disease (after some massaging to work around the few bugs mentioned above), and the output looks like
this. It pulls the Level-2 and Level-3 section headings because I like to make GA review notes section-by-section as I go through the article. It also uses the ref-tag processing to pull all the refs in the article into a Sources table for review. (I like to actually go through every source, verify it's
WP:RS, and do a lot of checking that the source is used properly.) As an additional helper, it uses the template processing to identify all the [v]cite journal templates, pulls the PMID for each one, and then goes to PubMed to pull the article's type and put it in the table - for medical articles we really insist on secondary sources like review articles and meta-analyses. The bot even handles the case where a single ref is bundled and has multiple journal templates with PMIDs. Just wanted to share, maybe solicit suggestions, and ... well, it'd be great to get the issues fixed; when they are, I'm celebrating the pony way. Appreciate all you're doing with the libraries...
Zad
68
03:51, 19 March 2013 (UTC)
I am writing because EarwigBot has not run task 2 (Articles for Creation statistics page/dashboard) in 3 days (last run 14:22, 18 March 2013). AfC is currently in a backlog elimination drive, and having to sort through already-reviewed submissions that are not coming off the list is exceptionally annoying. If you could do a manual run of task 2 and verify that it successfully completes, that would be greatly appreciated. Hasteur ( talk) 13:30, 21 March 2013 (UTC)
Good morning!
I need to post a page and am concerned that a page I am working on is about to be deleted. Any suggestions? Avewiki ( talk) 13:39, 27 March 2013 (UTC)
Articles for Creation urgently needs YOUR help!
Articles for Creation is desperately short of reviewers! We are looking for urgent help, from experienced editors, in reviewing submissions in the pending submissions queue. Currently there are 3329 submissions waiting to be reviewed and many help requests at our Help Desk.
If the answer to these questions is yes, then please read the reviewing instructions and donate a little of your time to helping tackle the backlog. You might wish to add {{ AFC status}} or {{ AfC Defcon}} to your userpage, which will alert you to the number of open submissions.
We would greatly appreciate your help. Currently, only a small handful of users are reviewing articles. Any help, even if it's just 2 or 3 reviews, would be extremely beneficial. |
(comment to make MiszaBot archive this) — Earwig talk 03:18, 4 April 2013 (UTC)
Hi Earwig--is it possible to rename a template using mwparserfromhell? I thought one could just use template.name('newname'), but apparently that's not the case. —
Theopolisme (
talk) 11:19, 29 April 2013 (UTC)
.name is not a function, it's an attribute, so you set it instead of calling it. Try template.name = 'newname'. —
Earwig
talk 21:38, 29 April 2013 (UTC)
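The attribute-vs-method distinction can be sketched with a toy stand-in (this is not the real mwparserfromhell Template class, just an illustration of the assignment pattern):

```python
# Toy stand-in for a template node: 'name' is a plain attribute, so
# renaming is an assignment, not a method call.
class Template:
    def __init__(self, name, params=None):
        self.name = name
        self.params = params or []

    def __str__(self):
        return "{{%s}}" % "|".join([self.name] + self.params)

t = Template("foo", ["bar"])
t.name = "newname"   # assignment, not t.name("newname")
print(str(t))        # {{newname|bar}}
```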
Hi! Could you look at my edits to Template:DRN case status and how EarwigBot undoes them? Am I doing something wrong?
BTW, my first attempt was to change the title on the main DRN page. In the past the bot has picked up on the change and updated the template, but that isn't happening either. -- Guy Macon ( talk) 06:19, 2 May 2013 (UTC)
In three cases that I identified as heavy copyvios by hand/eye (now hidden from the reader, but still present in the source), the copyvio detector reports between 40 and 50% confidence, and therefore claims "No violations detected" in a happy green box. [4] [5] [6] (see especially the details). I suggest using a yellow box with "there are hints of copying" for scores as high as those. Also, calculating confidence at the sentence or paragraph level would be helpful to get a more distinct score. Would it be a problem if the tool were used automatically to scan pages for copyvios (sequentially, of course)? (The response times let me guess that it might be resource-intensive.) -- Mopskatze ( talk) 17:26, 4 May 2013 (UTC)
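The suggested three-band reporting could be as simple as bucketing the confidence score. A hypothetical sketch (the 0.40 and 0.75 cutoffs are made-up illustration values, not the tool's actual thresholds):

```python
# Hypothetical banding of the detector's confidence score into the
# green/yellow/red scheme suggested above; cutoffs are illustrative only.
def verdict(confidence):
    if confidence >= 0.75:
        return "red: violation suspected"
    if confidence >= 0.40:
        return "yellow: there are hints of copying"
    return "green: no violations detected"

print(verdict(0.45))  # yellow: there are hints of copying
```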
Hi Earwig, remember this? One more character to add to the list of problematic characters in parsing "ref name=...": Ampersand.
Any progress here? I'm still very interested in using mwparserfromhell to make a ref-processing bot, so if there's anything I can do to help you debug or test, please let me know.
Zad
68
14:09, 3 May 2013 (UTC)
One more to look at: Parsing <div class="references-small"> fails as well, same reason I guess - because the parameter value has a hyphen in it, so that issue is not just limited to ref tag names. Thanks...
Zad
68
20:41, 6 May 2013 (UTC)