This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
A recent edit at Dave Grohl has produced "Lua error in Module:TwitterSnowflake at line 16: attempt to perform arithmetic on local 'c' (a string value)." That is seen by previewing the following.
{{cite tweet |author=Foo Fighters |title=Example |user=foofighters |number=1026546600946982912/video/1 |access-date=August 10, 2018 }}
The problem is due to "/video/1" in the number parameter and is easily fixed, but perhaps the module could show that number is invalid. Johnuniq ( talk) 02:43, 25 October 2021 (UTC)
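A check along the following lines would let the module report a bad number instead of failing on arithmetic. This is only a sketch in Python, not the module's Lua: the function names are hypothetical, and the epoch constant and 22-bit shift are the commonly documented snowflake layout, assumed here rather than taken from Module:TwitterSnowflake.

```python
from datetime import datetime, timezone

# Assumption: commonly documented snowflake epoch, in milliseconds.
TWITTER_EPOCH_MS = 1288834974657


def parse_tweet_number(number):
    """Return the tweet ID as an int, or None when the value is not all ASCII digits."""
    text = number.strip()
    if not (text.isascii() and text.isdigit()):
        return None  # e.g. "1026546600946982912/video/1"
    return int(text)


def snowflake_to_datetime(tweet_id):
    """Convert a snowflake ID to a UTC datetime (upper bits hold milliseconds since the epoch)."""
    milliseconds = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)


if __name__ == "__main__":
    for value in ("1026546600946982912", "1026546600946982912/video/1"):
        tweet_id = parse_tweet_number(value)
        if tweet_id is None:
            print(value, "-> invalid |number=")
        else:
            print(value, "->", snowflake_to_datetime(tweet_id).date())
```

Rejecting the value up front means the caller can emit an ordinary error message for |number= rather than a Lua error.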
There are usurped URLs; we also have usurped titles. See here: 38 domains have been usurped by a gambling site, and then ReFill or Citation bot adds a missing |title= pulled from the gambling site - d'oh. My bot can deal with the usurped URLs, but what about the titles: delete the title, or replace it with a placeholder? If it is deleted, that won't stop other tools from re-adding the usurped title. We could notify tool makers, but there is no guarantee they will implement a fix, future tools will be created, and there are also global wikis with the same problem. IMO a placeholder title (e.g. |title=Usurped title) that enters a tracking category with a help message would lock the title field against usurpation until someone can manually add a working title (if one can be found). -- Green C 19:18, 20 October 2021 (UTC)
My bot adds archive URLs and toggles |url-status=usurped - but it does not determine a working title. It can't just leave a spam usurped title; it needs to do something. The question is: what? -- Green C 21:41, 20 October 2021 (UTC)
|title=
(My bot can deal with the usurped URLs but what about the titles). But here you are saying that your bot "can't just leave a spam usurped title, it needs to do something". But if your bot "does not determine a working title", how can it know that the title in |title= is a spam usurped title? Too many 'buts'.
|title=Archived copy; only a handful of gnomes who have enabled maintenance messaging see the maintenance messages associated with Category:CS1 maint: archived copy as title (54,351).
|title=Usurped title. The keywords in the title are actually how the 38+ domains were first identified as being usurped. -- Green C 01:49, 21 October 2021 (UTC)
Usurped title (case insensitive) to the list of generic/bogus titles in the sandbox:
My bot adds archive URLs and toggles |url-status=usurped - but it does not determine a working title.
Setting |url-status=usurped masks the original (usurped) |url= in the citation's rendering when |archive-url= has a value:
|title=
value but can recognize a usurped title and replace that title with a value that will cause cs1|2 to emit an error message and category, a human or a 'title-finding' bot can make a repair.
|url-status=usurped? I can hard set on enwiki fairly soon, on other wikis it could be a long time if ever. -- Green C 16:34, 21 October 2021 (UTC)
|url=
values? Isn't that what edit filters are supposed to be for? Of course, if these urls get added to an edit filter, all of a sudden editors won't be able to publish other needed article changes until they do something about the blacklisted url. I don't have a lot of experience with edit filters, but I recall being stymied when I couldn't discover from the message just which one of dozens or more urls was the one that prevented publishing the article. Does anyone know if that has improved and the can't-publish-this-page-because-it-has-a-banned-url error message contains a clue about which url(s) triggered the edit filter? If there has been that improvement, then cs1|2 should, I think, stay out of it and let the edit filters do their jobs.
|url-status=usurped. The problem with 2 globally is that no bot exists for the indefinite future. It's actually quite difficult to usurp a domain: add archive URLs, flip the url-status, undo {{webarchive}} and convert to straight archive, remove entirely citations that have no archive, convert bare/square links to archive. To do it globally is a major undertaking due to the language and template differences. It would help, though be incomplete, if CS1|2 could detect from a list of domains and display as if |url-status=usurped. -- Green C 05:15, 22 October 2021 (UTC)
|url-status=usurped when such a url is detected. Setting |url-status=usurped does nothing when |archive-url= is empty or missing, so special code would need to be written to suppress the url when the internal |url-status=usurped is set. I fear that once added to a blacklist, urls will never be deleted so the list will grow until sometime down the road the system collapses because the size of the blacklist will push some articles over the Lua memory-use or execution-time limits.
|url-status=usurped or adding an archive URL - the edit filter blocks the corrective. They are left to delete the cite or URL. Since every domain dies eventually, and some percentage of those will be hijacked, long-term it is a problem, so I am trying to explore other solutions. Bots are the best answer, also the hardest. --
Green C 05:16, 23 October 2021 (UTC)
Pseudocode proposal (usurped routine):
Is url-status=usurped?
  Y
    Is archive-url=[non-empty]?
      Y
        url=archive-url
        via=archive service name
        Is title usurped?
          Y
            title=[original title from archive]
        exit (usurped routine)
      N (archive-url=empty)
        Is cite web?
          N
            Is title usurped?
              N
                hide span url url-access access-date format via
                exit (usurped routine)
          Y (cite web)
            flag delete/replace
            hide cite span
            exit
104.247.55.106 ( talk) 01:55, 23 October 2021 (UTC)
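For illustration, the proposal above can be read as the following decision routine. This is a Python sketch, not cs1|2 module code: the dictionary keys mirror the parameter names, while `is_usurped_title`, `archive_title`, `archive_service`, and the returned action strings are hypothetical, and the nesting of the branches is one possible reading of the outline.

```python
def usurped_routine(cite, is_cite_web, is_usurped_title,
                    archive_title=None, archive_service=None):
    """Apply the proposed usurped handling to a citation dict.

    Returns (new_cite, actions); actions lists what the renderer should do.
    """
    actions = []
    if cite.get("url-status") != "usurped":
        return cite, actions  # routine only applies to usurped citations

    out = dict(cite)  # work on a copy; leave the input untouched
    title = out.get("title", "")

    if out.get("archive-url"):
        # Archive available: link to it instead of the usurped url.
        out["url"] = out["archive-url"]
        if archive_service:
            out["via"] = archive_service
        if is_usurped_title(title) and archive_title:
            out["title"] = archive_title
        return out, actions

    # No archive-url from here on.
    if is_cite_web:
        # Nothing usable remains for a web-only source.
        actions += ["flag delete/replace", "hide cite"]
        return out, actions

    if not is_usurped_title(title):
        # Keep the citation, drop everything tied to the dead url.
        for key in ("url", "url-access", "access-date", "format", "via"):
            out.pop(key, None)
        actions.append("hide url fields")
    return out, actions
```

The case the outline leaves open (not cite web, no archive, usurped title) falls through unchanged here; a real implementation would need to decide what to do with it.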
Is anyone aware of a Zotero plugin for Firefox?
I'm aware that Zotero allows for exporting of references into CS1, but it does so by producing a text file with the code. I was hoping there might be some plugin that let me insert it in browser, much like you can insert Zotero references in MS Word.
Cheers. Kylesenior ( talk) 05:25, 24 October 2021 (UTC)
Case citations work pretty much like {{ cite book}}. I've linked an example of what I'm proposing above which was done in my sandbox. You can see the result right here. Obviously, things can be expanded later as far as features are concerned, but for now I would like to get thoughts on maybe adding this to the primary CS1-suite of templates. – MJL ‐Talk‐ ☖ 01:04, 20 October 2021 (UTC)
{{cite case}} template, create a wrapper template around {{cite book}}. Using Module:template wrapper makes all {{cite book}} parameters available and allows you to preset |type=Court case and any other parameters that should hold default or calculated values.
|vol= get held over as an alias. This would make {{cite court}} the only CS1/CS2-style citation template that has that alias. That is not even getting into the fact |lang=, |via=, |url-status=, |archive-url=, |archive-date=, |pinpoint= are all inconsistent with all other citation templates. – MJL ‐Talk‐ ☖ 15:13, 23 October 2021 (UTC)
|vol= and all of its current parameters, add support for parameters like |archive-url= very easily, and add CS1's error-checking and other features. – Jonesey95 (talk) 19:36, 23 October 2021 (UTC)
|vol= to be supported as an alias when the underlying templates don't. |vol= isn't like |litigants= which is specific to this use case, but it is just a general way one could refer to volume.
In Wayback_Machine#cite_ref-DigitalJournal_31-0 .. a generic title error on the phrase 'Wayback machine' is actually a legit part of the title. -- Green C 00:58, 29 October 2021 (UTC)
|website=
)
{{cite web |first=Alexander |last=Baron |title=((The new Internet Archive Wayback Machine now online)) |url=http://www.digitaljournal.com/article/360776 |website=Digital Journal |date=October 23, 2013 |access-date=November 19, 2020 |archive-date=November 19, 2020 |archive-url=https://web.archive.org/web/20201119071411/http://www.digitaljournal.com/article/360776}}
Are there any plans to link the citation templates with Wikidata? I was thinking of 2-way connections:
To confirm practicality, I made a very crude template at User:Aymatth2/citeQ to pull values from Wikidata. There is no error checking, but it seems to work:
Code | Renders |
{{User:Aymatth2/citeQ |Q25169 |page=123}} | Douglas Adams, Eoin Colfer (1979), The Hitchhiker's Guide to the Galaxy, p. 123 |
{{User:Aymatth2/citeQ |Q4386569 |page=34}} | Beatrix Potter (October 1903), The Tailor of Gloucester, Frederick Warne & Co., p. 34 |
{{User:Aymatth2/citeQ |Q313030 |page=456}} | Edward Gibbon (1776), The History of the Decline and Fall of the Roman Empire, p. 456 |
The advantage would be complete and consistent source descriptions rendered from a single vetted Wikidata entry. The citations would be the same across all articles that use the source apart from page number. Error messages or hidden categories could be generated when the Wikipedia values did not match the Wikidata values, so they could be tracked down and corrected. I am sure there are all sorts of complexities: books have different editions, journals get new publishers, articles are spread over multiple magazine editions, etc. But is there any reason why we would not work towards implementing something like this? Or is it in the works already? Aymatth2 ( talk) 14:03, 3 October 2021 (UTC)
{{cite Q}}?
48Pills: in this edit, alias of 'Lay summary' is not correct. Please change it back to the actual parameter name. – Jonesey95 (talk) 05:24, 3 October 2021 (UTC)
We should just deprecate and remove |lay-date=, |lay-format=, |lay-source=, and |lay-url=. I have marked these parameters as deprecated in the ~Whitelist/sandbox and will change our documentation to reflect that state.
Of course, now that I've done that, I expect that somebody's knickers will get in a twist and I'll end up at some drama board. Those parameters are not amenable to replacement by bot because some human must decide if they are important to the en.wiki article and then create a separate cs1|2 template for those sources. Creating a maintenance category is possible but very, very few of us even know that maintenance categories exist so it will be years before the last |lay-<param>= is removed (if ever).
— Trappist the monk ( talk) 12:16, 3 October 2021 (UTC)
When I use the issn= parameter a link to worldcat is automatically generated. Some ISSN values are valid and in portal.issn.org but not WorldCat. Example: https://portal.issn.org/resource/ISSN/2531-4661 vs. https://www.worldcat.org/issn/2531-4661 . Is there any way to control the automatically-generated link or just disable it? Thanks Jamplevia ( talk) 21:52, 30 October 2021 (UTC)
|issn=
is a low value parameter. There are other opinions expressed at
WP:ISSN. Apparently, portal.issn.org does not aid a reader of an en.wiki article in locating a copy of the periodical so, from that perspective, is of little use to our readers.}}
:
[https://portal.issn.org/resource/ISSN/2531-4661 ISSN 2531-4661 at issn.org] → ISSN 2531-4661 at issn.org
You recently edited Langbeinites a couple times to replace UTF numeral sub/superscript characters with either {{chem2}} or HTML <sup>...</sup> or <sub>...</sub> in the |title= field in {{cite}} templates. In both cases, this is not recommended because many fields of the various {{cite}} templates generate COinS metadata, which is used for citation cross-compatibility on the Internet, beyond just Wikipedia. See Template:Citation Style documentation/coins for {{cite}} fields that are COinS-producing. — sbb (talk) 12:58, 15 June 2021 (UTC)
@Beland: (I outdented my reply because some of the formatting I used doesn't like to be part of the wikitext : indentation). Well, since COinS strings are emitted entirely as the value of the title= attribute of empty HTML <span></span> tags, the only thing allowed in COinS strings is what can be in HTML attribute values. That's pretty much plain ASCII and URL-escaped entities. As an example, I created 3 references to a fake {{cite book}} reference titled H2O and r2, using 3 different ways to mark up the super- and subscripts (note also that the r is italicized with wiki markup):
Generated COinS data
ref 1:
<span title="ctx_ver=Z39.88-2004& rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook& rft.genre=book& rft.btitle=H%3Csub%3E2%3C%2Fsub%3EO+and+r%3Csup%3E2%3C%2Fsup%3E& rft.date=2021& rft.au=sbb& rfr_id=info%3Asid%2Fen.wikipedia.org%3AUser%3ASbb%2Fsandbox" class="Z3988"> </span>
ref 2
<span title="ctx_ver=Z39.88-2004& rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook& rft.genre=book& rft.btitle=H%26%238322%3BO+and+r%26sup2%3B& rft.date=2021& rft.au=sbb& rfr_id=info%3Asid%2Fen.wikipedia.org%3AUser%3ASbb%2Fsandbox" class="Z3988"> </span>
ref 3
<span title="ctx_ver=Z39.88-2004& rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook& rft.genre=book& rft.btitle=H%E2%82%82O+and+r%C2%B2& rft.date=2021& rft.au=sbb& rfr_id=info%3Asid%2Fen.wikipedia.org%3AUser%3ASbb%2Fsandbox" class="Z3988"> </span>
Note that in ref1, the plain HTML <sub>2</sub> and <sup>2</sup> are URL-escaped, telling anybody who consumes/uses that COinS string that the book's title is "H<sub>2</sub>O ...". It puts the constraint on the resource consumer to correctly parse HTML. Same situation with ref2, only instead of having to parse HTML <sub> and <sup> tags, they have to parse HTML entities. Still requires HTML parsing.
Only the last one, ref3, doesn't require HTML parsing, because the URL-escaped Unicode characters will be correctly interpreted.
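The three rft.btitle values can be reproduced with a few lines of Python. This only illustrates the percent-encoding step using the standard library's urllib.parse; it is not the code Module:Citation/CS1 runs, and the `variants` table is made up for the example.

```python
from urllib.parse import quote_plus, unquote_plus

# The same title written three ways, as in ref 1 to ref 3.
variants = {
    "ref 1 (HTML tags)": "H<sub>2</sub>O and r<sup>2</sup>",
    "ref 2 (HTML entities)": "H&#8322;O and r&sup2;",
    "ref 3 (Unicode)": "H\u2082O and r\u00b2",
}

# Percent-encode each title the way it appears in the COinS string.
encoded = {label: quote_plus(title) for label, title in variants.items()}

for label, value in encoded.items():
    decoded = unquote_plus(value)
    # Decoding only undoes the percent-encoding: tags and entities come back
    # as literal text that a consumer would still have to parse as HTML.
    print(f"{label}: rft.btitle={value}")
    print(f"    decodes to: {decoded}")
```

Only the third variant decodes to text that needs no further interpretation.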
Having said all that, note that wikitext is stripped from the data during Wikipedia's COinS generation. So no italicization, bolding, etc., get emitted into the COinS strings. This means that something like a title like, "Book about USS Iowa", will get interpreted as Book about USS Iowa.
— sbb ( talk) 19:49, 15 July 2021 (UTC)
"it looks like both HTML and Unicode subscripts go through the system intact [...]" I wouldn't think of it that way. Per the OpenURL spec [1], "Recognizing the international environments in which ContextObjects will be used, the Committee selected Unicode as the abstract character repertoire for ContextObjects." The data is represented by Unicode, and encoded as UTF-8. An OpenURL parser is required to understand Unicode, so a Unicode subscript character's representation is consistent. But parsers aren't required to then interpret the received Unicode string as partial HTML markup. So an HTML substring is just that: some characters in the ASCII range that may or may not be HTML, and aren't required to be parsed as such.
Use of templates within the citation template is discouraged because many of these templates will add extraneous HTML or CSS that will be included raw in the metadata. Also, HTML entities, for example &nbsp;, &ndash;, etc., should not be used in parameters that contribute to the metadata.
|title=, etc. fields. — sbb (talk) 21:04, 11 September 2021 (UTC)
Do not use the Unicode subscripts and superscripts ² and ³, or XML/HTML character entity references (&sup2; etc.). I started that discussion several months ago, and it didn't gain much traction: Wikipedia talk:Manual of Style/Superscripts and subscripts § Add exception to allow Unicode super/subscripts in COinS fields in cite xxx templates? — sbb ( talk) 22:52, 11 September 2021 (UTC)
@ Trappist the monk: Greetings! To answer your question raised in this revert, Sbb started a thread at User talk:Beland#Use of Templates, HTML, and HTML entities within citation templates. I think that happened because I was going around changing articles (including citations) to conform with MOS:FRAC and Wikipedia:Manual of Style/Superscripts and subscripts, and the current guidelines result in HTML markup instead of Unicode precomposed fractions, superscripts, and subscripts. I couldn't find an authoritative COinS specification that explains how to handle superscripts, fractions (including those not available as precomposed characters), italics, and other markup in fields. I thought Sbb was advocating without opposition that Unicode characters be used instead of markup, and I was starting to change the guidelines to reflect that when we got your attention. Sbb also pointed out there has been opposition at Wikipedia talk:Manual of Style/Superscripts and subscripts. So, it would be good to discuss so I can get some clarification on what the consensus is here so I can update my spellcheck code and guideline pages if necessary. There are several possibilities for what to do:
<sup>...</sup> etc.
Thoughts? -- Beland ( talk) 02:59, 12 September 2021 (UTC)
title="..." attribute of an empty HTML <span>...</span> element that also has the attribute class="Z3988". HTML attributes cannot contain markup of any kind, so if it can't be sanitised to remove the markup, it must be omitted in the first place. -- Redrose64 🌹 (talk) 07:22, 12 September 2021 (UTC)
%3Csub%3E. It looks like other fields (like the URL of the page) also use percent-encoding, so downstream consumers would be expected to percent-decode as a matter of course? The result of that decoding could be HTML or no-markup Unicode or MathML or whatever. -- Beland (talk) 16:49, 12 September 2021 (UTC)
<sup></sup> this is easy enough to be parsed correctly even by humans, but most math stuff is more complicated.
|descriptive-title= in addition to the proper title |title= and if the proper title is too complicated to use for metadata, pass down the descriptive title instead. -- Matthiaspaul (talk) 09:34, 12 September 2021 (UTC)
nowrap preventing the horrible line break between double quote and start of title, I haven't seen this yet. Can you provide an example? -- Matthiaspaul ( talk) 09:43, 12 September 2021 (UTC)
index entry you mentioned, how would a work such as in David's example be represented in your classification databases? The question is in regard to the visual appearance as well as how it is encoded there. Is this something that can be derived from the proper title, or is it a descriptive title?
Two things. 1) No Unicode characters. Those are a blight, and should be purged on sight. 2) Readers and accurate rendering of information are the priority. If COinS can't handle something, screw COinS. If magic codefu can be done to convert something non-COinS compliant to something COinS compliant behind the scenes (e.g. ''H''<sub>x</sub>20<sup>6</sup> → H_{x}20^{6} or whatever the COinS standard is), great, but it should not require editors to sacrifice accurate rendering. Headbomb {t · c · p · b} 14:51, 12 September 2021 (UTC)
A few considerations and a question:
-- Beland ( talk) 17:29, 12 September 2021 (UTC)
{{cite journal}}: templatestyles stripmarker in |title= at position 22 (help) — David Eppstein (talk) 18:57, 12 September 2021 (UTC)
&rft.atitle=Theory+of+free%2C+spin-%7F%27%22%60UNIQ--templatestyles-0000001B-QINU%60%22%27%7F%3Cspan+class%3D%22frac%22+role%3D%22math%22%3E%3Cspan+class%3D%22num%22%3E1%3C%2Fspan%3E%26frasl%3B%3Cspan+class%3D%22den%22%3E2%3C%2Fspan%3E%3C%2Fspan%3E+tachyons
Theory of free, spin-<span class="frac" role="math"><span class="num">1</span>⁄<span class="den">2</span></span> tachyons
Theory of free, spin-<span><span>1</span>⁄<span>2</span></span> tachyons
Theory of free, spin-1⁄2 tachyons
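The reduction shown in the lines above, from the raw &rft.atitle= value down to plain text, could be done roughly as follows. This is a Python sketch of the idea only, not what Module:Citation/CS1/COinS does; the function name and both regular expressions are assumptions made for the example.

```python
import html
import re
from urllib.parse import unquote_plus

# A stripmarker is DEL + '"`UNIQ--...-QINU`"' + DEL (pattern assumed from the sample above).
STRIPMARKER = re.compile("\x7f'\"`UNIQ--[^\x7f]*?-QINU`\"'\x7f")
# Crude tag matcher; adequate for the simple <span> markup that {{frac}} emits.
TAG = re.compile(r"<[^>]+>")


def plain_title(encoded_title):
    """Reduce a percent-encoded title to plain text suitable for metadata."""
    text = unquote_plus(encoded_title)   # undo the percent-encoding
    text = STRIPMARKER.sub("", text)     # drop the templatestyles stripmarker
    text = TAG.sub("", text)             # drop the span tags, keep their content
    return html.unescape(text)           # turn &frasl; into the fraction slash


raw = ("Theory+of+free%2C+spin-%7F%27%22%60UNIQ--templatestyles-0000001B-QINU"
       "%60%22%27%7F%3Cspan+class%3D%22frac%22+role%3D%22math%22%3E%3Cspan+"
       "class%3D%22num%22%3E1%3C%2Fspan%3E%26frasl%3B%3Cspan+class%3D%22den"
       "%22%3E2%3C%2Fspan%3E%3C%2Fspan%3E+tachyons")

if __name__ == "__main__":
    print(plain_title(raw))
```

Entities are unescaped last so that an escaped angle bracket in a title is not mistaken for a tag.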
alt= attribute with PNGs, plain text with TeX, or the contents of <annotation> elements with MathML), but this obviously doesn't cover all cases. It might be worth trying to further improve this, but we probably also need a |descriptive-title= to allow editors to specify themselves what should be passed on as metadata.
<math display=inline>210=14\times15=5\times6\times7=\binom{21}{2}=\binom{10}{4}</math>
<span class="mwe-math-element"><span class="mwe-math-mathml-inline mwe-math-mathml-a11y" style="display: none;"><math xmlns="http://www.w3.org/1998/Math/MathML" alttext="{\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}}">
<semantics>
<mrow class="MJX-TeXAtom-ORD">
<mstyle displaystyle="false" scriptlevel="0">
<mn>210</mn>
<mo>=</mo>
<mn>14</mn>
<mo>×<!-- × --></mo>
<mn>15</mn>
<mo>=</mo>
<mn>5</mn>
<mo>×<!-- × --></mo>
<mn>6</mn>
<mo>×<!-- × --></mo>
<mn>7</mn>
<mo>=</mo>
<mrow class="MJX-TeXAtom-ORD">
<mrow>
<mrow class="MJX-TeXAtom-OPEN">
<mo maxsize="1.2em" minsize="1.2em">(</mo>
</mrow>
<mfrac linethickness="0">
<mn>21</mn>
<mn>2</mn>
</mfrac>
<mrow class="MJX-TeXAtom-CLOSE">
<mo maxsize="1.2em" minsize="1.2em">)</mo>
</mrow>
</mrow>
</mrow>
<mo>=</mo>
<mrow class="MJX-TeXAtom-ORD">
<mrow>
<mrow class="MJX-TeXAtom-OPEN">
<mo maxsize="1.2em" minsize="1.2em">(</mo>
</mrow>
<mfrac linethickness="0">
<mn>10</mn>
<mn>4</mn>
</mfrac>
<mrow class="MJX-TeXAtom-CLOSE">
<mo maxsize="1.2em" minsize="1.2em">)</mo>
</mrow>
</mrow>
</mrow>
</mstyle>
</mrow>
<annotation encoding="application/x-tex">{\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}}</annotation>
</semantics>
</math></span><img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/4012a8a0261dae95c0a7443dbf67dcb58800df0c" class="mwe-math-fallback-image-inline" aria-hidden="true" style="vertical-align: -1.005ex; width:40.087ex; height:3.343ex;" alt="{\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}}"/>
<span class="mwe-math-fallback-source-inline tex" dir="ltr">$ {\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}} $</span>
<img src="https://wikimedia.org/api/rest_v1/media/math/render/png/4012a8a0261dae95c0a7443dbf67dcb58800df0c" class="mwe-math-fallback-image-inline" aria-hidden="true" style="vertical-align: -1.005ex; width:40.087ex; height:3.343ex;" alt="{\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}}" />
alt= attribute; for LaTeX we took everything between the paired $...$; for MathML we took the content of the <annotation>...</annotation> tag.
MATH+RENDER+ERROR. Except for that, all of the rest of the metadata are correct:
<span ...>...
&rft.genre=article
&rft.jtitle=Publicationes+Mathematicae+Debrecen
&rft.atitle=MATH+RENDER+ERROR
&rft.volume=51
&rft.issue=1%E2%80%932
&rft.pages=175-189
&rft.date=1997
&rft_id=%2F%2Fwww.ams.org%2Fmathscinet-getitem%3Fmr%3D1468225%23id-name%3DMR
&rft.aulast=Pint%C3%A9r
&rft.aufirst=%C3%81kos
&rft.au=de+Weger%2C+Benjamin+M.+M.
</span>
&rft.atitle=
is dependent on the preference settings of the editor who last saved the article:
&rft.atitle=%3Cspan+class%3D%22nowrap%22%3E%7B%5Cdisplaystyle+210%3D14%5Ctimes+15%3D5%5Ctimes+6%5Ctimes+7%3D%7B%5Cbinom+%7B21%7D%7B2%7D%7D%3D%7B%5Cbinom+%7B10%7D%7B4%7D%7D%7D%3C%2Fspan%3E
&rft.atitle=%3Cspan+class%3D%22nowrap%22%3E210%3D14%5Ctimes+15%3D5%5Ctimes+6%5Ctimes+7%3D%7B%5Cbinom+%7B21%7D%7B2%7D%7D%3D%7B%5Cbinom+%7B10%7D%7B4%7D%7D%3C%2Fspan%3E
&rft.atitle=%3Cspan+class%3D%22nowrap%22%3EMATH+RENDER+ERROR%3C%2Fspan%3E
MATH+RENDER+ERROR in the metadata. Alas, we cannot force editors to use PNG or LaTeX rendering, nor can we force MediaWiki to give us back the ability to extract content from math stripmarkers.
|title= is to have an alternate |math-title= or some such that requires some sort of special-secret-markup that is not <math>...</math> tags to wrap whatever would normally be in <math>...</math> tags so, for example:
|math-title=A title with some text and $210=14\times15=5\times6\times7=\binom{21}{2}=\binom{10}{4}$ and yet more text
|math-title= and then remove the special-secret-markup and put the result into the metadata. Then, the module would replace the special-secret-markup with actual opening and closing <math>...</math> tags, and then preprocess a special template that renders the math title. That rendering then goes into |title=. Yeah, pretty ugly, and I have no idea if it would work.
|title= with a <math> tag (not all |title= parameters are associated with cs1|2).
|math-title= that may, or may not, have $ delimited math text. If it finds a matched pair of $ delimiters, it replaces the delimiters with <math display=inline> and </math> and then preprocesses that string to get a math rendering that can be used in the citation's title:
{{#invoke:Sandbox/trappist_the_monk/math|math-title|math-title=$210=14\times15=5\times6\times7=\binom{21}{2}=\binom{10}{4}$}}
$210=14\times15=5\times6\times7=\binom{21}{2}=\binom{10}{4}$
|math-title= might be used in the metadata as-is because the $ delimiters are 'native' to LaTeX / TeX.
\$ (literal '$' appearing in math text), support for '$' appearing in plain text that is not math text – for |math-title=, requiring editors to escape '$' when it appears in text that is not math text seems a reasonable restriction for this parameter. No doubt there is other stuff to do with this hack before we consider implementing it in the cs1|2 module suite.
|title= for our local purposes.
|text-title=
and my |descriptive-title= are basically the same idea, except that in his, the contents of |text-title= would completely replace the contents of |title= for metadata purposes (similar to how your |math-title= would replace |title= for both our local rendering and the metadata), whereas my |descriptive-title= could be used instead of a normal |title= (if not given), but could also be combined with |title= (when both are given). The contents of the descriptive title should be displayed without text decoration when rendered (not sure if in front or following the normal title if both exist), and should be put into [square-brackets] in metadata to indicate that this is not the original title (probably prefixing the normal title if both exist). The different representation styles would allow them to be told apart when both are displayed or combined into the single &rft.atitle= or &rft.btitle= COinS key.
|descriptive-title= would effectively become your |math-title= when it contains some $TeX$. (And for the rare case, where the $TeX$ stuff should not be interpreted in your suggested way, we have our ((accept-this-as-written)) syntax to indicate this.) This way, the editor would have the flexibility to provide either the |title= or the |descriptive-title= (including its special handling for math), or both.
|descriptive-title=
in the past, non existing titles, dynamic titles, visual or acoustical only titles, functional titles, alias titles, unrepresentable titles, because too long, in unsupported scripts, or misleading in our context...none
" to display the localized "no title"), and the case where a title does exists, but should not be displayed for some reason (keyword "off
"), for example in an article listing many revisions of a work), but where we would still want to issue the complete metadata for it. Last year, I started to implement this by introducing these keywords to |title=none/off
, but realized we would still need something more like a |descriptive-title=
parameter to specify the title for metadata.
|descriptive-title= is scope creep. What we are trying to solve is the display of math in titles. So we should limit it to such with a name that makes it obvious the purpose of that parameter. Izno (talk) 18:02, 15 September 2021 (UTC)
|title=. It's still a possibility, but when we are now tinkering with the idea of introducing a dedicated |math-title=, it is important to also think about more general descriptive titles. After all, a title for a textual math representation is some kind of descriptive title. Otherwise, we easily end up with a whole new bunch of special title parameters, something, I think, we both want to avoid. Therefore, it is a valid question how to possibly combine this at least in the design, even if not all parts of the actual solution would be implemented at the same time.
Hmm, some sources, when ASCIIfying article titles, appear to use TeX-like markup inside special markers like "##" or "$". Examples: [1] [2]. And here's an example of <em>...</em> where we'd probably want to use '', but I think the em tag gets emitted in the final HTML: [3]. -- Beland (talk) 23:49, 13 September 2021 (UTC)
|text-title= parameter to the template to use as the text version of the title, and simultaneously to allow templates like {{frac}} or {{nowrap}} or whatever in titles when a text-title is present. That wouldn't address the inability to extract meaningful text from <math> formulas, but I'm sure Citation bot could be persuaded to add text-titles for those. One reason it's a bad idea is that the parameter would only produce invisible markup and therefore there wouldn't be much incentive for editors to make it accurate. — David Eppstein (talk) 00:34, 14 September 2021 (UTC)
|title=A {{frac|1|2}} Title
A '"`UNIQ--templatestyles-00000069-QINU`"'<span class="frac"><span class="num">1</span>⁄<span class="den">2</span></span> Title
class="frac", class="num", and class="den" are defined. None of that styling is available to readers who consume the citation through the metadata. Module:Citation/CS1 might remove the stripmarker, all class= attributes, and any <span>...</span> tags without attributes:
A <span role="math">1⁄2</span> Title
style= might be one, and remove other html tags.
|title={{nowrap|don't wrap this text}}
<span class="nowrap">don't wrap this text</span>
nowrap class is defined in MediaWiki:Common.css. cs1|2 would include this in the metadata:
don't wrap this text
<em>...</em> is inappropriate use where <i>...</i> would have been a better choice. Apparently the $ is a standard part of LaTeX and TeX used to delimit the beginning and end of math text; using a standardized delimiter is always better than making up our own delimiters. I've changed my example above to use the $ delimiters.
$ delimits in-line math text, which TeX renders in a smaller font size. -- Shmuel (Seymour J.) Metz Username:Chatul (talk) 14:48, 14 September 2021 (UTC)
$ delimiters are appropriate, right?
$ ... $ for inline math and $$ ... $$ for display math is old-school TeX markup. The modern alternative (better for being less ambiguous wrt actual dollar signs, and also with some technical advantages in actual TeX for making it easier to hang hooks in the code) is \( ... \) for inline math and \[ ... \] for display math. The Wikimedia developers have vetoed allowing these to be shortcuts for math markup in the Wikimedia codebase, but I suppose that doesn't prevent them from being used in templates that intercept them and convert them to <math display=inline> ... </math> and <math display=block> ... </math> respectively. Would this actually work? Can math tags in template output still be expanded, or is math tag expansion only done before the templates are expanded? If this could be done in the existing |title= parameter, I think that would be better than introducing a new multiplicity of confusing variations of title parameters. — David Eppstein (talk) 18:32, 15 September 2021 (UTC)
$ ... $. It's not a valid reason to avoid \( ... \) because very few references use that syntax (very likely, zero references) and because if they do we can fall back to the format-as-typed escape codes already used elsewhere in the citation templates. — David Eppstein (talk) 22:09, 15 September 2021 (UTC)
|math-title= (or whatever) into the normal |title=, as David suggests, this would be better from the user's perspective than to introduce a dedicated parameter for this. The question, however, is how often such $TeX$ stuff would conflict with normal titles. If collisions would be rather rare, we still have our ((accept-this-as-written)) syntax to force the template to take the title verbatim (which is already supported by |title= to override the removal of terminal punctuation).
|title= parameter then the question is how at least text titles for math (which fall under the category of descriptive titles) can be combined with more general descriptive titles interface-wise, so that we eventually need only one new parameter rather than two for semantically close purposes.
\( ... \) delimiters:
{{#invoke:Sandbox/trappist_the_monk/math|math_test2|math-title=Entropy-Based Uncertainty Measures for \(L^2(\mathbb{R}^n),\ell^2(\mathbb{Z})\), and \(\ell^2(\mathbb{Z}/N\mathbb{Z})\) With a Hirschman Optimal Transform for \(\ell^2(\mathbb{Z}/N\mathbb{Z})\)}}
Entropy-Based Uncertainty Measures for \(L^2(\mathbb{R}^n),\ell^2(\mathbb{Z})\), and \(\ell^2(\mathbb{Z}/N\mathbb{Z})\) With a Hirschman Optimal Transform for \(\ell^2(\mathbb{Z}/N\mathbb{Z})\)
<math>...</math>
tags).<math>...</math>
tags in parameter values are expanded into math stripmarkers before cs1|2 gets parameter values. After cs1|2 has rendered the citation, MediaWiki replaces each math stripmarker with its associated expansion. Using $...$
or \( ... \)
instead of <math>...</math>
tags allows us to apply <math>...</math>
tags and then expand them into math stripmarkers (to be replaced by MediaWiki after cs1|2 final rendering) at the time of our choosing.|title=
is that we have to inspect every |title=
value for the \( ... \)
delimiters and it is possible that some title somewhere legitimately uses the TeX delimiters. Inspecting every |title=
value is relatively inexpensive because all we have to look for is the opening \(
delimiter so if Title:find ('\\%(') then ... end
– attempt to convert delimiters to <math>...</math>
tags only when a \(
delimiter is present. I found only
two instances of the opening \(
delimiter; one is vandalism and the other a malformed title. It would not be so simple with the $...$
delimiters so if we proceed with this solution and choose to use $...$
delimiters, implementing |math-title=
along-side |title=
is the better choice.\( ... \)
to mark TeX blocks, for as long as our (( ... ))
wrapping syntax would disable the feature.
[\( ... \) TeX delimiters experiment removed]
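As a rough illustration of the delimiter handling discussed above, here is a Python sketch (not the module's Lua code). It applies the cheap check for an opening \( before attempting any conversion, and it leaves a title wrapped in (( ... )) unconverted. The function name `convert_tex_delimiters` and the exact treatment of the (( ... )) wrapper are assumptions made for this example.

```python
import re

# One \( ... \) span; the body is matched lazily so that each opening
# delimiter pairs with the nearest closing \).
TEX_SPAN = re.compile(r"\\\((.+?)\\\)")


def convert_tex_delimiters(title: str) -> str:
    """Hypothetical helper: rewrite \\( ... \\) spans as <math>...</math>."""
    # Accept-as-written wrapper: strip the (( )) and skip the conversion.
    if title.startswith("((") and title.endswith("))"):
        return title[2:-2]
    # Cheap pre-check, in the spirit of the Title:find test quoted above:
    # only titles containing an opening \( are inspected any further.
    if "\\(" not in title:
        return title
    return TEX_SPAN.sub(lambda m: "<math>" + m.group(1) + "</math>", title)
```

A title with no \( is returned unchanged after a single substring search, which is why inspecting every |title= value stays inexpensive.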
warningmessages. If this change is accepted, I expect to remove the parts of Module:Citation/CS1/COinS that decoded the math stripmarker content – it won't be needed.
that decoded the math stripmarker contents, this would not affect the code for SVG and LaTeX math extraction, only for MathML, right?
<math>...</math>
markup in a |title=
parameter, and then publish that article, the live cs1|2 module will create the metadata string for that citation (coins_replace_math_stripmarker()
) using the math settings in my preferences because MediaWiki renders that math image into a stripmarker before cs1|2 gets the content of |title=
. Since the stripmarker was created using my settings, the metadata will be derived from my settings. The resulting metadata are then cached for everyone until some other editor saves the article and their math preference setting is different from mine.\( ... \)
TeX delimiters are introduced, as noted elsewhere in this discussion, an awb or some such script will be required to replace the <math>...</math>
markup which will cause an article refresh and so new metadata using the \( ... \)
TeX delimited wikitext straight from the appropriate parameter. Because we feed the metadata directly from the \( ... \)
TeX delimited wikitext, there is no need (and no ability) to decode a math stripmarker, so the code that decoded the math stripmarker content (even if it still worked) will no longer be needed and so should be removed. If we ever need it, we can always get it back from a previous version of the module.<math>...</math>
markup at least for a while and it would have been convenient for them if they could continue to use it for entries which either do not contribute to the metadata, or to entries contributing to metadata, if they have selected SVG or LaTeX, not MathML. However, this would put the burden to switch to \( ... \)
on the next editor with MathML settings and would also leave the citation source code in a mix of markups, which might not be desirable for other parties which read our wikitext rather than metadata, so yes, I agree, a hard switch is probably the better approach here.a huge number of errors with existing citationsis an accurate description. If these search results are to be believed, there are:
<math>...</math>
tags to \( ... \)
TeX delimiters. That script can be run on the same day that the change goes live (if it goes live) and be done after a couple of hours.|script-=
parameters which become part of the metadata. Are there other parameters ending up in metadata where math-like constructs could show up occasionally? What about the journal name, work, author etc. names and publisher entries?|trans-=
parameters, and |quote=
, |script-quote=
and |trans-quote=
. Should we support the \(...\) syntax there as well as an alternative for syntax compatibility/consistency, or should we insist on <math>
there?\( ... \)
TeX delimiters for math markup. |journal=
? |work=
? |publisher=
? |author=
? I don't think so; at least not until a need has been sufficiently demonstrated. Quick searches for <math>...</math>
tags in those parameters either timed out with no results or returned no results. We should only support one form of math markup.\(...\)
TeX delimiters experiment.Given the new proposed solution for <math>...</math>
markup and the above comments, I'm wondering where we've come down on how to handle simple markup. I see contradictions between editors like "No unicode characters. Those are a blight, and should be purged on sight." vs. "exactly as the source information presents it, 'funny' characters and all". David Eppstein said titles should be formatted "the way the reference formatted it, even when our style guidelines would tell us to use a different style", but then used {{
frac}} instead of ½. Our style guide says to use {{
sfrac}} for science articles, so that seems to satisfy neither the goal of looking consistent with the style of body text nor the goal of being exactly the same as the original document for ease of search.
What are we proposing as the solution for simple markup, like a chemical formula? If we're following the sources exactly, we might use no-markup Unicode vs. <sup>...</sup>
depending on what the original document does, though if it's on paper or PDF it will be impossible to tell. If we're avoiding Unicode compatibility characters, then we still have at least three choices:
{{
cite book}}
: templatestyles stripmarker in |title=
at position 17 (
help) - {{chem2|H2O2}} - Copy-paste: Something about H 2O 2
Though it's unclear to me how well any database or web search engine is going to handle the difference between, say, "H2O2" as a search parameter and an internally stored "H<sub>2</sub>O<sub>2</sub>". -- Beland ( talk) 17:19, 18 September 2021 (UTC)
{{
chem2}}
; your example renders this mishmash:
'"`UNIQ--templatestyles-0000008D-QINU`"'<span class="chemf nowrap">H<sub class="template-chem2-sub">2</sub>O<sub class="template-chem2-sub">2</sub></span>
<br />
tags:
Something about H
2O
2.
<sub>...</sub>
tags and friends cleanly? A rule like "use HTML sup/sub tags instead of Unicode subscripts and superscripts" would be easy to follow and easy to enforce, so I'm thinking maybe do that for now?{{
frac}}
above at my 11:46, 14 September 2021 post.|title=Naked Gun 33{{sfrac|1|3}}
|title=Naked Gun 33'"`UNIQ--templatestyles-00000092-QINU`"'<span class="sfrac"><span class="tion"><span class="num">1</span><span class="sr-only">/</span><span class="den">3</span></span></span>
templatestyles
stripmarker refers to
Template:Sfrac/styles.css which defines the classes: sfrac
, tion
, num
, den
, and sr-only
. As with {{frac}}
, none of that styling is available to readers who consume the citation through the metadata so for them, the markup is just meaningless clutter.|descriptive-title=
(or |text-title=
per David) as a fallback, so that editors can use fancy stuff in |title=
for pretty local display purposes (without compromising for COinS), while still being able to exactly match a title, if known, as it may be used in an external database (regardless of what representation or transliteration may be used there) so that it can be used as search pattern there as well.filter/cleanup/simplifysolution may not be sufficient. In my sandbox I've hacked some code that removes:
<br />
tags (used in {{
chem2}}
)class=
attributes from <span>
tagsstyle=
attributes from <span>
tagstitle=
attributes from <span>
tags<span>
without attributes and its matching </span>
Naked Gun 33{{code|{{sfrac|1|3}}}}
Naked Gun 33'"`UNIQ--syntaxhighlight-00000099-QINU`"'
{{#invoke:Sandbox/trappist_the_monk/math|span_test|1=Naked Gun 33{{sfrac|1|3}}}}
Naked Gun 331/3
{{chem2|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2}}
'"`UNIQ--templatestyles-000000A1-QINU`"'<span class="chemf nowrap">[{(η<sup>5</sup>-C<sub class="template-chem2-sub">5</sub>Me<sub class="template-chem2-sub">4</sub>)SiMe<sub class="template-chem2-sub">2</sub>(η<sup>1</sup>-NCMe<sub class="template-chem2-sub">3</sub>)}(PMe<sub class="template-chem2-sub">3</sub>)Sc(μ<sub>2</sub>-H)]<sub class="template-chem2-sub">2</sub></span>
{{#invoke:Sandbox/trappist_the_monk/math|span_test|1={{chem2|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2}}}}
[{(η<sup>5</sup>-C<sub class="template-chem2-sub">5</sub>Me<sub class="template-chem2-sub">4</sub>)SiMe<sub class="template-chem2-sub">2</sub>(η<sup>1</sup>-NCMe<sub class="template-chem2-sub">3</sub>)}(PMe<sub class="template-chem2-sub">3</sub>)Sc(μ<sub>2</sub>-H)]<sub class="template-chem2-sub">2</sub>
{{chem2|C2H3O2(-)}}
'"`UNIQ--templatestyles-000000A9-QINU`"'<span class="chemf nowrap">C<sub class="template-chem2-sub">2</sub>H<sub class="template-chem2-sub">3</sub>O<span class="template-chem2-su"><span>−</span><span>2</span></span></span>
{{#invoke:Sandbox/trappist_the_monk/math|span_test|1={{chem2|C2H3O2(-)}}}}
C<sub class="template-chem2-sub">2</sub>H<sub class="template-chem2-sub">3</sub>O−2
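The removal rules listed above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the sandbox module's Lua: the name `simplify_html` is hypothetical, stripmarkers are assumed to have the `'"`UNIQ--…-QINU`"'` shape seen in the examples, and only `<span>` tags are touched (attributes on `<sub>` and `<sup>` are left alone, as in the sandbox output shown).

```python
import re

STRIPMARKER = re.compile("'\"`UNIQ--[A-Za-z0-9-]+?-QINU`\"'")
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)
SPAN_TOKEN = re.compile(r"<span\b([^>]*)>|</span>", re.IGNORECASE)
# class=, style= and title= attributes, with their quoted values.
DROPPED_ATTR = re.compile(
    r"\s*(?<![\w-])(?:class|style|title)\s*=\s*(?:\"[^\"]*\"|'[^']*')",
    re.IGNORECASE,
)


def simplify_html(html: str) -> str:
    """Hypothetical simplifier for citation-title HTML destined for metadata."""
    text = BR_TAG.sub("", STRIPMARKER.sub("", html))
    out, pos = [], 0
    kept = []  # one flag per open <span>: True if the tag was kept
    for tok in SPAN_TOKEN.finditer(text):
        out.append(text[pos:tok.start()])
        pos = tok.end()
        if tok.group(0).lower() == "</span>":
            # Emit the close only when its matching open tag survived.
            if kept and kept.pop():
                out.append("</span>")
        else:
            attrs = DROPPED_ATTR.sub("", tok.group(1)).strip()
            kept.append(bool(attrs))
            if attrs:
                out.append("<span " + attrs + ">")
    out.append(text[pos:])
    return "".join(out)
```

The stack is what keeps a surviving `<span role="math">` paired with its own `</span>` while attribute-less spans nested inside it are dropped.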
 
when a <span role="math">
gets eliminated, and when the text before and after a <span>x<br/>y</span>
to be eliminated would be framed in <sub>y</sub>
and <sup>x</sup>
(perhaps only if inside a class="chemf"?).{{chem2|C2H3O2(-)}}
example a bit, and ideally, the stripped markup in that example should look just like the input. The "2" subscript should bind closer to the "O" than the "-" charge. —
sbb (
talk) 01:22, 24 September 2021 (UTC)
{{
chem2/sandbox}}
input. Why anchor encoding? Before cs1|2 parameter values are added to the metadata, they are percent-encoded so, in its present form, what the metadata will get is:
[Cl4Re\qReCl4](2−)
← {{
chem2/sandbox|[Cl4Re\qReCl4](2−)}}
– anchor encoding
%26%2391%3BCl4ReqReCl4%26%2393%3B%282%E2%88%92%29
– percent encoding of the anchor-encoded input%5BCl4ReqReCl4%5D%282%E2%88%92%29
← [Cl4Re\qReCl4](2−) – percent encoding\q
is treated as an escaped q
so the result is missing the \
(%5C
). This particular {{chem2}}
input needs to be tweaked to escape the back slash:
%5BCl4Re%5CqReCl4%5D%282%E2%88%92%29
← [Cl4Re\\qReCl4](2−) – percent encoding{{chem2}}
accepts standardized so that consumers of the metadata will know what they mean when the metadata are decoded? If not then those symbols need to be replaced with the actual thing that they represent, don't they?MeTaDaTa-OuTpUt:
), but optionally also the (parameter) input (i.e. MeTaDaTa-InPuT:
). This might allow the extractor not only to replace the complete output of a template by its metadata (as in the current {{
chem2}} example) but allow metadata fragments to be inherited from internally called templates instead of having to handle everything monolithically on the level of the outer template (example: {{
chem}}, which internally uses {{
su}} - still thinking about the details...)are the input symbols that {{chem2}}
accepts standardized so that consumers of the metadata will know what they mean when the metadata are decoded?
I guess, this very much depends on the template, so even if this would be a standard notation in this particular case (I don't know if it is), it probably won't be in the general case. However, this is still a demo with the main purpose to illustrate how easy it would be to enhance templates in general. In a proper implementation, {{
chem2}} would probably not just forward its own input as metadata, but actually generate the metadata by processing the input (like it does for its normal output, but) in a form which would be text-only or use only very simple markup. What can be considered to be the best metadata very much depends on the purpose/function of the template. The advantage of this approach would be that the developers or users of the template probably know best what is the optimal text-only metadata that can be generated from the input (developers would program the template to generate the optimal metadata for the context the template is used in, and users would always be able to override it using the |metadata=
parameter), whereas the generic HTML simplifier in CS1/CS2 has no knowledge on the context and semantics and can only simplify based on universal structural rules. 
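For readers who want to reproduce the percent-encoded strings quoted earlier in this thread, here is a small Python sketch. It assumes that every reserved character is encoded (nothing is treated as safe), which matches the quoted strings but is not a statement about how cs1|2 itself is implemented; `percent_encode` is a name chosen for this example.

```python
from urllib.parse import quote


def percent_encode(value: str) -> str:
    """Percent-encode a value as UTF-8, leaving no reserved character as-is."""
    return quote(value, safe="")


# The formula with the backslash lost, and with the backslash preserved.
# "\u2212" is the minus sign used in the chem2 example.
without_backslash = percent_encode("[Cl4ReqReCl4](2\u2212)")
with_backslash = percent_encode("[Cl4Re\\qReCl4](2\u2212)")
```

The two results differ only by `%5C`, the encoded backslash, which is the difference pointed out in the discussion.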
but ignores  
. I have added a workaround at least to the wrapper:
Module:DecodeEncode.decode.)title=
attribute. We could then use this instead of the actual HTML for metadata purposes (similar to what we do with math SVG and LaTeX extraction). Given that the HTML title= might be used by the template for other purposes already, and that it is also shown to users as tooltip (which might not be desirable if it contains stuff like "[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2
"), I am using the title= attribute only for illustration purposes here and we might find another HTML attribute or establish a special "
steganographic" notation where/how we could transparently hide those entries for possible extraction by CS1/CS2. Templates might even have a standardized optional parameter like |metadata=
to override what the template would otherwise use for this. Templates enhanced this way could get a sticker like "CS1/CS2-compatible" or such. Sure, this would work only for those templates which have been enhanced this way, but all we would have to do now is to specify a standard for this and implement a generic extraction mechanism which would take over whenever CS1/CS2 finds this special HTML attribute/notation in a citation's title. Over the years more and more templates could be adapted accordingly.<span class="MeTaDaTa::[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2">
normal_template_output
</span>
MeTaDaTa
" magic), it would replace the whole span including normal_template_output
(and, if present, also the corresponding stripmarker) by what follows the ::
following the MeTaDaTa
(which probably needs to be encoded in an actual implementation). For a template call like {{chem2|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2}}
this would result in [{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2
. Would the template be called like {{chem2|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2|metadata=This is a text-only transcription of the chemical formula}}
instead, it would result in This is a text-only transcription of the chemical formula
. |metadata=off/none
would disable the metadata (nothing would be following the "::
" then). If the extractor does not find the triggering magic, or if the extracted data would be an empty string, it would proceed with the HTML simplification demoed above...{{frac/sandbox|1|2|3}}
'"`UNIQ--templatestyles-000000BB-QINU`"'<span class="frac MeTaDaTa::%E2%80%891%C2%A02%2F3" role="math">1<span class="sr-only">+</span><span class="num">2</span>⁄<span class="den">3</span></span>
{{frac/sandbox|1|2|3|metadata=Custom-Metadata}}
'"`UNIQ--templatestyles-000000C0-QINU`"'<span class="frac MeTaDaTa::Custom-Metadata" role="math">1<span class="sr-only">+</span><span class="num">2</span>⁄<span class="den">3</span></span>
{{frac/sandbox|1|2|3|metadata=off}}
'"`UNIQ--templatestyles-000000C4-QINU`"'<span class="frac MeTaDaTa::" role="math">1<span class="sr-only">+</span><span class="num">2</span>⁄<span class="den">3</span></span>
{{sfrac/sandbox|1|2|3}}
'"`UNIQ--templatestyles-000000C8-QINU`"'<span class="sfrac">1<span class="sr-only">+</span><span class="tion"><span class="num">2</span><span class="sr-only">/</span><span class="den">3</span></span></span>
{{sfrac/sandbox|1|2|3|metadata=Custom-Metadata}}
'"`UNIQ--templatestyles-000000CC-QINU`"'<span class="sfrac">1<span class="sr-only">+</span><span class="tion"><span class="num">2</span><span class="sr-only">/</span><span class="den">3</span></span></span>
{{sfrac/sandbox|1|2|3|metadata=off}}
'"`UNIQ--templatestyles-000000D0-QINU`"'<span class="sfrac">1<span class="sr-only">+</span><span class="tion"><span class="num">2</span><span class="sr-only">/</span><span class="den">3</span></span></span>
{{chem2/sandbox|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2}}
'"`UNIQ--templatestyles-000000D4-QINU`"'<span class="chemf nowrap">[{(η<sup>5</sup>-C<sub class="template-chem2-sub">5</sub>Me<sub class="template-chem2-sub">4</sub>)SiMe<sub class="template-chem2-sub">2</sub>(η<sup>1</sup>-NCMe<sub class="template-chem2-sub">3</sub>)}(PMe<sub class="template-chem2-sub">3</sub>)Sc(μ<sub>2</sub>-H)]<sub class="template-chem2-sub">2</sub></span>
{{chem2/sandbox|C2H3O2(-)}}
'"`UNIQ--templatestyles-000000D8-QINU`"'<span class="chemf nowrap">C<sub class="template-chem2-sub">2</sub>H<sub class="template-chem2-sub">3</sub>O<span class="template-chem2-su"><span>−</span><span>2</span></span></span>
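The extraction step proposed above could look roughly like the following Python sketch. It assumes the marker sits in the `class` attribute exactly as in the {{frac/sandbox}} examples, that the payload is percent-encoded, and that the stripmarker has already been removed; the function name `replace_with_metadata` is hypothetical. When no marker is found, or the payload is empty (the |metadata=off case), it returns `None` so that the caller can fall back to HTML simplification.

```python
import re
from typing import Optional
from urllib.parse import unquote

# An opening <span> whose class attribute carries the MeTaDaTa:: marker.
MARKER = re.compile(
    r'<span\b[^>]*\bclass="[^"]*?MeTaDaTa::([^"\s]*)[^"]*"[^>]*>'
)
SPAN_TOKEN = re.compile(r"<span\b[^>]*>|</span>")


def replace_with_metadata(html: str) -> Optional[str]:
    """Hypothetical extractor: swap a marked span for its decoded payload."""
    m = MARKER.search(html)
    if m is None or m.group(1) == "":
        return None  # no marker, or metadata switched off
    depth = 1
    for tok in SPAN_TOKEN.finditer(html, m.end()):
        depth += -1 if tok.group(0) == "</span>" else 1
        if depth == 0:
            # Replace the whole span, nested output included.
            return html[:m.start()] + unquote(m.group(1)) + html[tok.end():]
    return None  # unbalanced markup: leave it to the fallback
```

Counting nested spans is needed because the normal template output inside the marked span contains spans of its own.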
\(...\)
markup so that may go away. Removing certain html markup may or may not be adequate; I don't know, I'm not a chemist so I don't know if the resulting output to the metadata would be at all useful.\(...\)
at all, quite the contrary. He was concerned that issuing a visible error message would upset people, but that was before we discussed alternatives.
Theory of free, spin-<span class="frac" role="math"><span class="num">1</span>⁄<span class="den">2</span></span> tachyons
Theory of free, spin-1⁄2 tachyons
<sup>
and <sub>
instead of <span>
s? The sup/sub can be styled for on-Wiki use, and stripped of class attributes that are meaningless to COinS consumers. —
sbb (
talk) 01:26, 24 September 2021 (UTC)
Are there any objections or comments on doing the following to get the ball rolling:
<sub>...</sub>
and <sup>...</sup>
instead of {{
chem}} and {{
chem2}} for chemistry formulas in citations.?
I don't often see more complicated math in citations, but would we want to make a {{
citemath}} that uses <math>...</math>
for now and can be switched over to \(...\) when that change is ready to be deployed? (And then easily changed again later if handling of TeX-like math formulas needs to change.) --
Beland (
talk) 01:19, 23 September 2021 (UTC)
|descriptive-title=
(which would also be useful for many other purposes, as mentioned further above). With this in place, we may need only a few general recommendations how to provide titles instead of having to address this explicitly in the MOS.{{frac|1|2|3}}
|title=
, prompting the current implementation to throw a stripmarker error message:
'"`UNIQ--templatestyles-000000DD-QINU`"'<span class="frac">1<span class="sr-only">+</span><span class="num">2</span>⁄<span class="den">3</span></span>
1+2⁄3
<span role="math">1+2⁄3</span>
1+2⁄3
|descriptive-title=
). Assuming the COinS consuming entity would be able to process HTML, a HTML engine at their end would make this out of the simplified HTML:
1 2/3
existing Unicode superscripts and subscripts, while I agree that the HTML sub- and superscripts look nicer if used in formulas and are generally to be preferred, in non-scientific articles an occasionally interspersed Unicode super- or subscript character in citation titles might not be a bad idea at all. At least they are COinS-safe out of the box and neither require an HTML engine at the receiver's end nor a TeX-savvy human to be decoded. I would not use them in technical articles, but also would not want to ban them in non-technical articles. So, it all depends on the context IMO. What does this mean in regard to MOS or more citation-related guidelines? We could offer some generally recommended best practices there, but we should not rule out any of the possible formats in general. And what does that mean in regard to CS1/CS2? We will have to cope with whatever editors throw at us, therefore we probably need all of them: special \( ... \) markup, HTML simplifier, template internal metadata, and
|descriptive-title=
to cope with all possible cases optimally."hey I'll have that code done in the next couple weeks", the next CS1/CS2 update isn't scheduled yet but I guess it could be in mid-October.
|descriptive-title=
), would allow us to address all aspects of the problem in the best-possible way without putting restrictions on users which templates or math markup they can use in citations, so that they can use what is best (based on their editorial capabilities) to produce the desired nice-looking output in rendered citations, but still would produce (or at least allow to produce) perfectly simplified and semantically correct metadata at the same time.I tweaked this citation today and introduced an extraneous =
character in |newspaper==Duluth News Tribune
. But, I did not see it so the article got published with my error. I expect that I'm not the only one to have done that. So, I've tweaked the extraneous punctuation test:
| Wikitext | {{cite news |
|---|---|
| Live | "Lynn Diane (Swapinski) Jurek". Obituaries. =Duluth News Tribune. Duluth, MN. Archived from the original on 2013-01-21. {{cite news}}: CS1 maint: extra punctuation (link) |
| Sandbox | "Lynn Diane (Swapinski) Jurek". Obituaries. =Duluth News Tribune. Duluth, MN. Archived from the original on 2013-01-21. {{cite news}}: CS1 maint: extra punctuation (link) |
Extraneous punctuation is not considered an error so the article ends up in Category:CS1 maint: extra punctuation and cs1|2 displays the green maintenance message for those few who have enabled maintenance messaging.
— Trappist the monk ( talk) 18:58, 1 November 2021 (UTC)
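To show how the doubled = ends up inside the value, here is a minimal Python sketch. It is not the module's actual test: the leading-= check is the case described above, while the set of trailing characters is purely an assumption for the example, and both function names are hypothetical.

```python
def split_parameter(text: str):
    """Split 'name=value' at the first '=' only; later '=' stay in the value."""
    name, _, value = text.partition("=")
    return name.strip(), value.strip()


def has_extra_punctuation(value: str) -> bool:
    """Hypothetical check, not the module's actual test."""
    value = value.strip()
    if not value:
        return False
    # Leading '=' is the case from |newspaper==Duluth News Tribune.
    # The trailing characters checked here are an assumption for this sketch.
    return value.startswith("=") or value.endswith((",", ";", ":"))
```

With `newspaper==Duluth News Tribune`, the split yields the value `=Duluth News Tribune`, which the check then flags for the maintenance category.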
I've just formatted a reference to RFC 9134 and have been advised to report the "check |rfc= value" error here. Range checking apparently needs to be updated. ~ Kvng ( talk) 17:33, 2 November 2021 (UTC)
I've made a piece of software called PressPass (
a longer explanation of its features and functions is here, along with the code). Essentially, what it does is automatically generate filled-out {{
cite news}} invocations from
Newspapers.com clippings and search pages. Currently, I am revising some parts of the generation functions, and making a configuration menu (for stuff like, e.g., whether to include access-date
). However, I would like to ensure that the templates it generates are properly formatted.
Here is what it looks like, for this clipping:
<ref name="Charle18031112">{{Cite newspaper|url=https://www.newspapers.com/clip/87466966/public-auction/|date=1803-11-12|page=4|title=Public Auction|newspaper=The Charleston Daily Courier|location=Charleston, South Carolina}}</ref><!-- Sat -->
So far, in the configuration menu, I'm writing features to allow multi-line cite templates, as well as different options for the date output (1969-12-31, 31-12-1969, 1969 Dec 31, 1969 December 31, December 31, 1969), and the ability to specify whether access-date
, via
, or location
are included.
This is, more or less, all the information exposed to my script from the clipping page. The headline has to be typed in manually by the user since Newspapers.com doesn't have this all scraped. That said, I understand that there's a lot of "best practices" with regard to the ordering of parameters, et cetera (and I can have the date output as whatever). Since I expect this to be used a lot (and have been using it a lot myself, for example in improving Bradford Island to FA), is there anything I should be doing? Is there anything I'm missing? jp× g 20:43, 21 October 2021 (UTC)
|title=
, so please ensure that your tool requires it. If you are adding |access-date=
, remember that it requires |url=
. Some of those date formats are invalid on Wikipedia and in CS1 templates. See
MOS:DATE for valid formats. –
Jonesey95 (
talk) 20:58, 21 October 2021 (UTC)
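As a sketch of what a date option restricted to commonly accepted formats might look like, the Python below offers only three styles: numeric year-month-day, day-month-year, and month-day-year with full month names. That these three are acceptable is my reading of MOS:DATE and should be checked against it; the function name and style keys are hypothetical, and month names are spelled out explicitly so the output does not depend on the system locale.

```python
from datetime import date

MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def format_cs1_date(d: date, style: str = "mdy") -> str:
    """Hypothetical formatter limited to three date styles."""
    month = MONTHS[d.month - 1]
    if style == "iso":
        return d.isoformat()                 # 1803-11-12
    if style == "dmy":
        return f"{d.day} {month} {d.year}"   # 12 November 1803
    if style == "mdy":
        return f"{month} {d.day}, {d.year}"  # November 12, 1803
    raise ValueError(f"unsupported date style: {style!r}")
```

Rejecting any other style outright means the tool cannot be configured into producing outputs such as 31-12-1969.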
<ref>
tag name=
attribute and hyphenate the date; consider including the page number and provision for a disambiguator for the cases where multiple articles sharing the same page are cited (<ref name="Charleston 1803-11-12 p4a">
).|via=Newspapers.com
because that is the recommended form at
WP:Newspapers.com and because the clipping is not delivered by the publisher.|location=
should not be displayed except when it is needed to disambiguate the newspaper named in |newspaper=
.|access-date=
is generally not required (the newspaper is not an ephemeral source).|pages=[https://www.newspapers.com/clip/56035464/the-los-angeles-times/ 1], [https://www.newspapers.com/clip/56035546/the-los-angeles-times/ 10]
for as many pages as are necessary; allow for page ranges{{
Cite newspaper}}
to {{
Cite news}}
or maybe {{
Cite web}}
. Some discussions
here and
here. I also don't see the need to note the day of the week, nor do I understand why it's an HTML comment. And regarding date formats: extremely cool, albeit not to be expected, would be if PressPass tried to automatically use the format specified in the page's {{Use xxx dates}}
template, if present. Otherwise, what Jonesey said (22:35). —
JohnFromPinckney (
talk /
edits) 00:21, 22 October 2021 (UTC)
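Several of the suggestions above (a required title, a hyphenated date plus page number and disambiguator in the ref name, |via=Newspapers.com, {{Cite news}} rather than {{Cite newspaper}}, and |location= only on request) can be combined in a short Python sketch. This is not PressPass's code; the function name and its arguments, including the short label used in the ref name, are hypothetical.

```python
def build_cite_news(title, newspaper, date_iso, page, url,
                    ref_label, suffix="", location=None):
    """Hypothetical builder for a <ref> wrapping a {{Cite news}} call."""
    if not title or not title.strip():
        raise ValueError("|title= is required")
    params = [("title", title.strip()), ("newspaper", newspaper)]
    if location:
        # Left to the caller: include only when the newspaper needs it.
        params.append(("location", location))
    params += [
        ("date", date_iso),
        ("page", str(page)),
        ("url", url),
        ("via", "Newspapers.com"),
    ]
    body = " ".join(f"|{name}={value}" for name, value in params)
    # e.g. "Charleston 1803-11-12 p4a"
    ref_name = f"{ref_label} {date_iso} p{page}{suffix}"
    return '<ref name="' + ref_name + '">{{Cite news ' + body + '}}</ref>'
```

Leaving |access-date= out entirely and making |location= opt-in mirrors the advice given here; whether location should be the default is a separate question.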
|location=
should not be displayed except when it is needed to disambiguate the newspaper named in |newspaper=
".|location=
is ESSENTIAL unless the name of the city of publication is part of the name of the newspaper". --
Alarics (
talk) 09:33, 22 October 2021 (UTC)
We do not permit the use of the YYYY-MM-DD format for dates in the Julian calendar. The Newspapers.com site provides coverage for some newspapers that were published before 1752, the year in which the British colonies in North America changed from the Julian calendar to the Gregorian calendar. An example of a newspaper available from Newspapers.com for this situation is The Pennsylvania Gazette for the year range 1728 to 1752. The metadata emitted by Citation Style 1 is false for Julian calendar dates. In this situation, I suggest you emit a plain text citation with no template. For example,
[This example illustrates another failing of Citation Style 1. It lacks the ability to use a description as a title, for cases when the author or publisher have not given a story a title.]
Jc3s5h ( talk) 14:51, 22 October 2021 (UTC)
FWIW, saving your clipping into Zotero, then dragging it here from there, using Zotero's Wikipedia exporter function, produces:
{{Cite news| pages = 4| title = Public Auction| work = The Charleston Daily Courier| location = Charleston, South Carolina| accessdate = 2021-10-25| date = 1803-11-12| url = https://www.newspapers.com/clip/87466966/public-auction/}}
While any volunteer is, of course, free within reason to work on anything they like, perhaps effort would be better deployed in tweaking that exporter, rather than (if I may) reinventing the wheel? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:53, 25 October 2021 (UTC)
I've made a lot of improvements to the script, and a new version is here; thanks to everyone who helped me out in this discussion! The documentation page covers all of its behavior: let me know if there is anything I missed. jp× g 21:46, 7 November 2021 (UTC)
|author=
& |agency=
. These are not so much discovery parameters, although they may help in finding the correct source faster, but they have a reliability component. An otherwise unreliable source (such as a newspaper that is an official or semi-official organ of a political party, or is state-controlled) may carry a press report from a press agency with proven prior reliability. Or may take through syndication, the column of an author with a history of past objectivity. Such treatment may provide the citation with elevated reliability. In any case, |author=
should probably be included even when there is no byline. This is recommended: |author=<!--Staff writer(s); no byline.-->
.
172.254.222.178 (
talk) 23:17, 7 November 2021 (UTC)|author=<!--Not stated-->
; see
Help:Citation Style 1 § Authors.
I wanted to add an example of the use of ref=none. This is the usual form, so I felt an example of its use was appropriate. Unfortunately, there is no talk page for discussion of this, as it links to here. Hawkeye7 (discuss) 21:11, 9 November 2021 (UTC)
|ref=none
with the edit summary Example text. I reverted that because
live documentation is not for testing. Editor Hawkeye7 then added new
{{
markup2}}
template with the edit summary NOT a test edit - ref=none needs to be used all the time - documentation needs to show example of this. I reverted again noting that the initial edit was labeled as a 'test' and suggested discussion here.
|ref=none
is noted in the template's
documentation which links to a longer discussion at {{
citation}}
. I'm not sure that yet-another-example at {{
cite encyclopedia}}
is all that beneficial.Template:Cite book/doc states:
Afterword
, Foreword
, Introduction
, or Preface
will display unquoted; any other value will display in quotation marks. The author of the contribution is given in contributor.However, I cannot manage to get any of Afterword
, Foreword
, Introduction
, or Preface
to work. Could someone help?
Veverve (
talk) 13:12, 13 November 2021 (UTC)
{{cite book |contributor=Foreword author |contribution=Foreword |author=Book author |title=Title of the book}}
Can we create an equivalent of Template:Page numbers for video timestamps? I would like to be able to cite minute #, second # of a video, as a reference for a statement. I believe this feature and a "Video timestamps needed" tag would improve the WP:V of videos used as references, of which there are many. LondonIP ( talk) 23:44, 13 November 2021 (UTC)
{{cite AV media|url=https://www.youtube.com/watch?v=ctoF5Ctc0ZM|title=Day at Night: Muhammad Ali, legendary boxing champion|time=21:50}}
The Egyptologist Dorothea Arnold doesn't have an article here, but does in three other Wikipedias. Unsurprisingly,
doesn't work. Is there a workaround (or am I missing something obvious)? -- Hoary ( talk) 01:23, 14 November 2021 (UTC)
{{Cite book |last=Arnold |first=Dorothea |author-link=:d:Special:EntityPage/Q1246153#sitelinks-wikipedia |last2=Allen |first2=James P. |last3=Green |first3=L. |url=https://books.google.com/books?id=sGLFwVkljQMC&pg=PA135 |title=The Royal Women of Amarna: Images of Beauty from Ancient Egypt |date=1996 |publisher=Metropolitan Museum of Art |isbn=978-0-87099-816-4 |language=en}}
Template:Cite web/Danish is listed as auto subst but has 165 transclusions. Any idea why it isn't being subst? Gonnym ( talk) 20:10, 15 November 2021 (UTC)
{{
Kilde www}}
a redirect to {{
cite web/Danish}}
? Does
User:AnomieBOT/source/tasks/TemplateSubster.pm understand redirects? I have no experience with Perl so it isn't clear to me from that code. Perhaps Editor
Anomie can provide an answer.<ref>
s (but it won't subst inside <nowiki>
, which makes {{
row numbers}} annoying). You're right that having over 100 transclusions will prevent substing, as a safety measure against some vandal finding a template with thousands of transclusions and having the bot subst them all. 100 seems a decent cutoff for someone to manually fix should a bad substing run need to be reverted.
Anomie
⚔ 22:02, 15 November 2021 (UTC)
|date=
parameters, if anyone is interested in troubleshooting. The code is too nested and strange for me to parse and does not catch invalid dates. –
Jonesey95 (
talk) 23:30, 15 November 2021 (UTC)
I was following this discussion with interest, but I see now it's been archived with nothing being done? Given the unanimous consensus there to make the proposed changes to the "volume" output, and since no-one objected to Kanguole's sandbox edits, is there any reason this can't be implemented right away? Dan from A.P. ( talk) 10:04, 21 November 2021 (UTC)
I have added 'author', 'collaborator', 'contributor', 'editor', 'translator' as bogus names. Yeah, this will break the convenient |author=Author
|editor=Editor
etc demo uses but these exist in article-space templates where they do not belong. We will just have to be a little more creative in our demos.
— Trappist the monk ( talk) 16:22, 22 November 2021 (UTC)
|author=XYZ, Inc. replaced by |last=XYZ |first=Inc. Then there are authors who are individual people, but whose names do not follow the Western "forename surname" convention, so I might use |author=Ban Ki-moon for which both |last=Ban |first=Ki-moon and |first=Ban |last=Ki-moon are just plain wrong. This has been discussed before. -- Redrose64 🌹 (talk) 22:40, 22 November 2021 (UTC)
{{cite book/new |author=XYZ, Inc. |title=Title}}
{{cite book/new |author=Ban Ki-moon |title=Title}}
I am not sure if here is the correct location for my question so please forgive my trespass. I read myself around in circles a while to see if this had been discussed before and the only thing I found that was close is THIS which was about display of titles in italics vs quotes. It is why I felt this best posted here since it is of a similar type subject.
These are very different documents in scope and purpose and it feels erroneous that the difference isn't displayed accurately in the citations. Has there already been a discussion/consensus on this issue that I missed, or is it possibly a simple oversight and easy fix?
With best regards.
--- Darryl.P.Pike (talk) 20:40, 23 November 2021 (UTC)
|type= to display something other than the default "Thesis". – Jonesey95 (talk) 21:01, 23 November 2021 (UTC)
{{Cite thesis/doc}}, in the Template Data section, shows an error, "|degree= is not a valid parameter". I think that error message, which appears to be caused by Module:Cs1 documentation support, is incorrect. |degree= appears to be a documented and working alias of |type=. It looks like something needs to be tweaked. – Jonesey95 (talk) 21:06, 23 November 2021 (UTC)
|degree= is defined as a parameter that is unique to {{cite thesis}}
|degree= does not have any aliases
|type= as an alias of |degree=; it is not, |degree= modifies the template-specific default TitleType metaparameter unless overridden by |type=
|type= does not have |degree= as an alias
I find the placement of the publisher in journal article citations distracting and a bit confusing. For example:
I think journal volume and issue should go immediately right after journal name, instead of publisher inserted between them. Like, the above example should read:
This would make it more logical and smooth to read. Can someone please explain why we put the publisher in the place like we're doing? Any specific citation rule that I'm not aware of? Thanks a lot. Sorry for my bad English. 2604:3D08:4E7F:F7E0:952C:82DC:1A4F:9CD1 ( talk) 16:51, 28 November 2021 (UTC)
|publisher= in {{cite journal}}. Yeah, the documentation sucks. But at Help:Citation Style 1 § Work and publisher under the Publisher bullet is this:
Wikipedia:Village pump (proposals)#rfc: shall we update cs1/2?
— Trappist the monk ( talk) 23:13, 28 November 2021 (UTC)
Does someone know how to edit the TemplateData for this? It should be something like the following. Thanks.
"suggestedvalues": [ "live", "dead", "unfit" ]
— Michael Z. 01:07, 29 November 2021 (UTC)
{{cite book}} appears to have something like what you are suggesting; see Template:Cite book/TemplateData
{{cite web}} in the visual editor, and the field has a dropdown, but it’s not populated with values (if I enter a valid one, then only it appears). Clicking on the template's documentation brought me here. — Michael Z. 01:38, 29 November 2021 (UTC)
{{cite web}} and {{cite book}} appear to have more-or-less the same "suggestedvalues" values (order is different); I doubt that order makes a difference. There are other obvious differences where one template has something that the other template does not: {{cite book}} has "default"; {{cite web}} has "autovalue" and "suggested". I suppose that these might make a difference but I don't know. Have you tried an experiment using the {{cite book}} template instead of {{cite web}}?
Is there a reason why double quotation marks in such cases are not automatically displayed as single quotation marks? E.g.
{{cite web |title=Title with "quotation marks" in it |url=http://www.example.com/}}
Cheers – Finnusertop ( talk ⋅ contribs) 08:29, 29 November 2021 (UTC)
|title= includes at least one double quote mark so the search includes templates like {{cite book}} that don't wrap |title= in quote marks and ignores |chapter= (and aliases) parameters that do.
There's always been a weird behaviour when |collaboration= is set. E.g.
{{Cite book |last1 = Van Dijk |first1 = Peter Paul |last2 = Iverson |first2 = John |last3 = Shaffer |first3 = H. Bradley |last4 = Bour |first4 = Roger |last5 = Rhodin |first5 = Anders |collaboration=Turtle Taxonomy Working Group |year = 2012 |chapter = Turtles of the World, 2012 Update: Annotated Checklist of Taxonomy, Synonymy, Distribution, and Conservation Status |title = Conservation Biology of Freshwater Turtles and Tortoises |doi = 10.3854/crm.5.000.checklist.v5.2012 |isbn = 978-0965354097}}
gives
but it should instead give
Et al should not be applied automatically when collaborations are set. Headbomb { t · c · p · b} 06:09, 4 December 2021 (UTC)
The majority of cases would have the et al. though. Headbomb {talk / contribs / physics / books} 02:16, 26 December 2015 (UTC)), and we're now some 6 years down the road. Has your assessment given then changed for some reason? Do you want to track down and add it to all the citations that rely on the current behavior? Izno ( talk) 07:38, 4 December 2021 (UTC)
|collaboration= which rely on an automatic et al? Izno (talk) 08:14, 4 December 2021 (UTC)
|collaboration= which inappropriately adds et al? The vast majority of uses requiring an et al. to be displayed already have a manually set display-authors. The vast majority of uses which don't have a display authors shouldn't have the et al. There are very, very few citations with a collaboration parameter set that need an automatic et al to be added. Headbomb { t · c · p · b} 15:57, 4 December 2021 (UTC)
|display-authors=4 works properly, but |display-authors=5 returns an error. This should be handled at the source. I suppose when/if development on the module collection resumes, this could be tasked, following discussion. 65.88.88.71 (talk) 16:47, 4 December 2021 (UTC)
|display-authors= cannot be set manually for authors>4, or remove default-value rendering. 65.88.88.71 (talk) 16:59, 4 December 2021 (UTC)
"four authors" default?
|collaboration= does not count author names. The documentation does say that 'et al.' will be appended to the author-name list when |collaboration= is used. If you believe that the documentation can be improved, please do so.
{{Cite book |last1 = Van Dijk |first1 = Peter Paul |last2 = Iverson |first2 = John |last3 = Shaffer |first3 = H. Bradley |last4 = Bour |first4 = Roger |last5 = Rhodin |first5 = Anders |collaboration=Turtle Taxonomy Working Group |year = 2012 |chapter = Turtles of the World, 2012 Update: Annotated Checklist of Taxonomy, Synonymy, Distribution, and Conservation Status |title = Conservation Biology of Freshwater Turtles and Tortoises |doi = 10.3854/crm.5.000.checklist.v5.2012 |isbn = 978-0965354097|display-authors=4}}
{{Cite book |last1 = Van Dijk |first1 = Peter Paul |last2 = Iverson |first2 = John |last3 = Shaffer |first3 = H. Bradley |last4 = Bour |first4 = Roger |last5 = Rhodin |first5 = Anders |collaboration=Turtle Taxonomy Working Group |year = 2012 |chapter = Turtles of the World, 2012 Update: Annotated Checklist of Taxonomy, Synonymy, Distribution, and Conservation Status |title = Conservation Biology of Freshwater Turtles and Tortoises |doi = 10.3854/crm.5.000.checklist.v5.2012 |isbn = 978-0965354097|display-authors=5}}
{{cite book}}: Invalid |display-authors=5 (help)
|display-authors= and |collaboration=. 65.88.88.69 (talk) 20:28, 4 December 2021 (UTC)
|display-authors=etal is a de facto default when |collaboration=some value. 65.88.88.69 (talk) 21:37, 4 December 2021 (UTC)
{{citation/core}}, there was provision for no more than 9 authors, but the ninth (if supplied) was never displayed. By default, the first 8 were always displayed, and if you supplied either |last9= or |author9=, the first 8 would be followed by "et al". This cut-off could be adjusted by means of the |display-authors= parameter, which accepted an integer in the range 1-8, so you could show fewer than 8 (but not more) before the "et al". Regardless of the number actually displayed, all 9 would be put into the COinS. All this changed in 2013 when Module:Citation/CS1 was introduced. -- Redrose64 🌹 (talk) 23:27, 4 December 2021 (UTC)
|collaboration= in that citation? My gut reaction is that the collaboration's name should be included and the list of individual names truncated to one or a few primary names because I suspect (without any evidence to support this) that it will be easier for a reader to locate the source by primary authors and the name of the collaboration than by the names of all n author names (when n is a relatively large number).
|collaboration= without any |author=, |last=, or |vauthors= is silently ignored:
{{cite book |collaboration=The Writers Group |title=Title}} → Title.
|collaboration= requires at least one author name so templates without that name should declare a missing-name error.
|display-authors=etal is the default. It should not be so, per POLA. Headbomb { t · c · p · b} 19:04, 4 December 2021 (UTC)
A commitment to fixing uses after a change can help ensure the change gets made in the first place.
|display-authors=etal is the default. It should not be so. 65.88.88.71 (talk) 15:18, 4 December 2021 (UTC)
"Neither of us have any way of detecting problem cases." is true, but that also means we have no data to support any suggested implementation. I do agree that the current default, "et al", probably reflects more citations to collaborations than the "we named all named authors so we don't need the et al" case.
|display-authors= to be set to some reasonable value/key to indicate that no et al should be displayed from the current default.
|authorn=et al. as the last author in an arbitrarily shortened author list? Or is "et al." a bogus name? Such citation may be convenient but it is not correct. The correct form is to give all authors as they appear on the source (subject to module limits) and then optionally truncate the list by using |display-authors=. This way both verifiability and attribution are satisfied. Any other concern (additional text, more cumbersome editing, etc) comes in a distant last. 65.204.10.232 (talk) 15:11, 5 December 2021 (UTC)
|collaboration= and all authors in the collaboration are also provided in the citation of interest.
"anything stopping an editor from eschewing display-authors and adding" Yes, an error displays: Last1; et al. Title. {{cite book}}: Explicit use of et al. in: |last2= (help)
CS1 maint: numeric names: authors list (link), subsequently cleaned up by gnomes.
Note, for example, Template:Cite Blomefield, which throws a "Check date values in: |publication-date=" error, but the c. works fine for the |year= parameter. Could the cite book template be modified so "c." is valid for |publication-date=? paul2520 💬 19:13, 5 December 2021 (UTC)
{{Cite Blomefield}} consider removing the 'when written' date (currently in |year=) and replacing |publication-date= with |date=. The 'when written' date doesn't really aid a reader in locating a copy of the source and may, in fact, cause some confusion because the 'when written' date appears first in the rendered citation. There has been some discussion here about changing Module:Citation/CS1 so that |publication-date= becomes a complete alias of |date=
.
A recent edit at Dave Grohl has produced "Lua error in Module:TwitterSnowflake at line 16: attempt to perform arithmetic on local 'c' (a string value)." That is seen by previewing the following.
{{cite tweet |author=Foo Fighters |title=Example |user=foofighters |number=1026546600946982912/video/1 |access-date=August 10, 2018 }}
The problem is due to "/video/1" in the number parameter and is easily fixed, but perhaps the module could show that number is invalid. Johnuniq ( talk) 02:43, 25 October 2021 (UTC)
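The check suggested above can be sketched in a few lines of Python. This is only an illustration, not the code of Module:TwitterSnowflake: the function names are hypothetical, and the timestamp arithmetic assumes the publicly documented snowflake layout (milliseconds since the Twitter epoch stored above the low 22 bits). The idea is to reject a non-numeric |number= before any arithmetic is attempted, so the template can show an error instead of a Lua crash.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Assumption: snowflake IDs count milliseconds from this epoch (2010-11-04).
TWITTER_EPOCH_MS = 1288834974657


def parse_tweet_number(raw: str) -> Optional[int]:
    """Return the tweet ID as an int, or None if |number= is not purely digits."""
    text = raw.strip()
    if not re.fullmatch(r"[0-9]+", text):
        return None  # e.g. "1026546600946982912/video/1"
    return int(text)


def snowflake_to_datetime(tweet_id: int) -> datetime:
    """Derive the posting time from the ID; the low 22 bits are not time data."""
    ms = (tweet_id >> 22) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


if __name__ == "__main__":
    for value in ("1026546600946982912/video/1", "1026546600946982912"):
        number = parse_tweet_number(value)
        if number is None:
            print(f"{value!r}: invalid |number=")
        else:
            print(f"{value!r}: posted {snowflake_to_datetime(number):%Y-%m}")
```

With a check of this kind, the value containing "/video/1" would be reported as invalid rather than reaching the date calculation.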
There are usurped URLs, and we also have usurped titles. See here: 38 domains have been usurped by a gambling site, then ReFill or Citation bot add a missing |title= pulled from the gambling site - d'oh. My bot can deal with the usurped URLs but what about the titles: delete the title, or replace it with a placeholder? If it is deleted, it won't stop other tools from re-adding the usurped title again. We could notify tool makers, but there is no guarantee they will implement a fix, and future tools will be created; there are also global wikis with the same problem. IMO a placeholder title (e.g. |title=Usurped title) that enters a tracking category with a help message would lock up the title field from usurpation until someone can manually add a working title (if one can be found). -- GreenC 19:18, 20 October 2021 (UTC)
|url-status=usurped - but it does not determine a working title. It can't just leave a spam usurped title, it needs to do something. The question is: What? -- GreenC 21:41, 20 October 2021 (UTC)
|title= ("My bot can deal with the usurped URLs but what about the titles"). But here you are saying that your bot "can't just leave a spam usurped title, it needs to do something". But if your bot "does not determine a working title" how can it know that the title in |title= is a spam usurped title? Too many 'buts'.
|title=Archived copy; only a handful of gnomes who have enabled maintenance messaging see the maintenance messages associated with Category:CS1 maint: archived copy as title (54,351).
|title=Usurped title. The keywords in the title are actually how the 38+ domains were first identified as being usurped. -- GreenC 01:49, 21 October 2021 (UTC)
Usurped title (case insensitive) to the list of generic/bogus titles in the sandbox:
"My bot adds archive URLs and toggles |url-status=usurped - but it does not determine a working title."
Setting |url-status=usurped masks the original (usurped) |url= in the citation's rendering when |archive-url= has a value:
|title= value but can recognize a usurped title and replace that title with a value that will cause cs1|2 to emit an error message and category, a human or a 'title-finding' bot can make a repair.
|url-status=usurped? I can hard set on enwiki fairly soon, on other wikis it could be a long time if ever. -- GreenC 16:34, 21 October 2021 (UTC)
|url= values? Isn't that what edit filters are supposed to be? Of course, if these urls get added to an edit filter, all of a sudden editors won't be able to publish other needed article changes until they do something about the blacklisted url. I don't have a lot of experience with edit filters, but I recall being stymied when I couldn't discover from the message just which one of dozens or more urls was the one that prevented publishing the article. Does anyone know if that has improved and the can't-publish-this-page-because-it-has-a-banned-url error message contains a clue about which url(s) triggered the edit filter? If there has been that improvement, then cs1|2 should, I think, stay out of it and let the edit filters do their jobs.
|url-status=usurped. The problem with 2 globally is no bot exists for the indefinite future. It's actually quite difficult to handle a usurped domain: add archive URLs, flip the url-status, undo {{webarchive}} and convert to straight archive, remove entirely citations that have no archive, convert bare/square links to archive. To do it globally is a major undertaking due to the language and template differences. It would help, though be incomplete, if CS1|2 could detect from a list of domains and display as-if |url-status=usurped. -- GreenC 05:15, 22 October 2021 (UTC)
|url-status=usurped when such a url is detected. Setting |url-status=usurped does nothing when |archive-url= is empty or missing so special code would need to be written to suppress the url when the internal |url-status=usurped is set. I fear that once added to a blacklist, urls will never be deleted so the list will grow until sometime down the road the system collapses because the size of the blacklist will push some articles over the lua memory-use or execution-time limits.
|url-status=usurped or adding an archive URL - the edit filter blocks the corrective. They are left to delete the cite or URL. Since every domain dies eventually, and some percentage of those will be hijacked, long-term it is a problem so trying to explore other solutions. Bots are the best answer, also the hardest. -- GreenC 05:16, 23 October 2021 (UTC)
Pseudocode proposal (usurped routine):
Is url-status=usurped?
Y
Is archive-url=[non-empty]?
Y
url=archive-url
via=archive service name
Is title usurped?
Y
title=[original title from archive]
exit (usurped routine)
N (archive-url=empty)
Is cite web?
N
Is title usurped?
N
hide span url url-access access-date format via
exit (usurped routine)
Y (cite web)
flag delete/replace
hide cite span
exit
104.247.55.106 ( talk) 01:55, 23 October 2021 (UTC)
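One possible reading of the pseudocode proposal above, written out as Python so the branches are explicit. This is a sketch under stated assumptions, not cs1|2 module code: the Citation class, the function name, and the archive_service / archive_title arguments are all hypothetical, and the nesting of the Y/N branches is inferred because the indentation of the original was lost. The case of a usurped title in a non-{{cite web}} template with no archive is not covered by the proposal and is left untouched here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Citation:
    """Hypothetical stand-in for a parsed citation template."""
    template: str                      # e.g. "cite web", "cite book"
    params: Dict[str, str]
    hidden: Set[str] = field(default_factory=set)   # parts not rendered
    flags: List[str] = field(default_factory=list)  # maintenance flags


def usurped_routine(cite: Citation, title_is_usurped: bool,
                    archive_service: str = "",
                    archive_title: Optional[str] = None) -> Citation:
    p = cite.params
    if p.get("url-status") != "usurped":
        return cite                    # routine does not apply

    if p.get("archive-url"):           # archive available: point at it
        p["url"] = p["archive-url"]
        p["via"] = archive_service
        if title_is_usurped and archive_title:
            p["title"] = archive_title  # original title from the archive
        return cite

    if cite.template != "cite web":    # no archive, not cite web
        if not title_is_usurped:
            cite.hidden.update(
                {"url", "url-access", "access-date", "format", "via"})
        return cite

    # no archive and cite web: nothing useful left to show
    cite.flags.append("delete/replace")
    cite.hidden.add("cite")
    return cite
```

How "Is title usurped?" would be decided is exactly the open question in the thread above; the sketch simply takes the answer as an argument.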
Is anyone aware of a Zotero plugin for Firefox?
I'm aware that Zotero allows for exporting of references into CS1, but it does so by producing a text file with the code. I was hoping there might be some plugin that let me insert it in browser, much like you can insert Zotero references in MS Word.
Cheers. Kylesenior ( talk) 05:25, 24 October 2021 (UTC)
Case citations work pretty much like {{ cite book}}. I've linked an example of what I'm proposing above which was done in my sandbox. You can see the result right here. Obviously, things can be expanded later as far as features are concerned, but for now I would like to get thoughts on maybe adding this to the primary CS1-suite of templates. – MJL ‐Talk‐ ☖ 01:04, 20 October 2021 (UTC)
{{cite case}} template, create a wrapper template around {{cite book}}. Using Module:template wrapper makes all {{cite book}} parameters available and allows you to preset |type=Court case and any other parameters that should hold default or calculated values.
|vol= get held over as an alias. This would make {{cite court}} the only CS1/CS2-style citation template that has that alias. That is not even getting into the fact |lang=, |via=, |url-status=, |archive-url=, |archive-date=, |pinpoint= are all inconsistent with all other citation templates. – MJL ‐Talk‐ ☖ 15:13, 23 October 2021 (UTC)
|vol= and all of its current parameters, add support for parameters like |archive-url= very easily, and add CS1's error-checking and other features. – Jonesey95 (talk) 19:36, 23 October 2021 (UTC)
|vol= to be supported as an alias when the underlying templates don't. |vol= isn't like |litigants= which is specific to this use case, but it is just a general way one could refer to volume.
In Wayback_Machine#cite_ref-DigitalJournal_31-0 .. a generic title error on the phrase 'Wayback machine' is actually a legit part of the title. -- Green C 00:58, 29 October 2021 (UTC)
|website=
)
{{cite web |first=Alexander |last=Baron |title=((The new Internet Archive Wayback Machine now online)) |url=http://www.digitaljournal.com/article/360776 |website=Digital Journal |date=October 23, 2013 |access-date=November 19, 2020 |archive-date=November 19, 2020 |archive-url=https://web.archive.org/web/20201119071411/http://www.digitaljournal.com/article/360776}}
Are there any plans to link the citation templates with Wikidata? I was thinking of 2-way connections:
To confirm practicality, I made a very crude template at User:Aymatth2/citeQ to pull values from Wikidata. There is no error checking, but it seems to work:
Code | Renders |
{{User:Aymatth2/citeQ |Q25169 |page=123}} | Douglas Adams, Eoin Colfer (1979), The Hitchhiker's Guide to the Galaxy, p. 123 |
{{User:Aymatth2/citeQ |Q4386569 |page=34}} | Beatrix Potter (October 1903), The Tailor of Gloucester, Frederick Warne & Co., p. 34 |
{{User:Aymatth2/citeQ |Q313030 |page=456}} | Edward Gibbon (1776), The History of the Decline and Fall of the Roman Empire, p. 456 |
The advantage would be complete and consistent source descriptions rendered from a single vetted Wikidata entry. The citations would be the same across all articles that use the source apart from page number. Error messages or hidden categories could be generated when the Wikipedia values did not match the Wikidata values, so they could be tracked down and corrected. I am sure there are all sorts of complexities: Books have different editions, journals get new publishers, articles are spread over multiple magazine editions, etc.. But is there any reason why we would not work towards implementing something like this? Or is it in the works already? Aymatth2 ( talk) 14:03, 3 October 2021 (UTC)
{{cite Q}}?
48Pills: in this edit, alias of 'Lay summary' is not correct. Please change it back to the actual parameter name. – Jonesey95 (talk) 05:24, 3 October 2021 (UTC)
We should just deprecate and remove |lay-date=, |lay-format=, |lay-source=, and |lay-url=. I have marked these parameters as deprecated in the ~Whitelist/sandbox and will change our documentation to reflect that state.
Of course, now that I've done that, I expect that somebody's knickers will get in a twist and it'll all end up at some drama board. Those parameters are not amenable to replacement by bot because some human must decide if they are important to the en.wiki article and then create a separate cs1|2 template for those sources. Creating a maintenance category is possible but very, very few of us even know that maintenance categories exist so it will be years before the last |lay-<param>= is removed (if ever).
— Trappist the monk ( talk) 12:16, 3 October 2021 (UTC)
When I use the issn= parameter a link to worldcat is automatically generated. Some ISSN values are valid and in portal.issn.org but not WorldCat. Example: https://portal.issn.org/resource/ISSN/2531-4661 vs. https://www.worldcat.org/issn/2531-4661 . Is there any way to control the automatically-generated link or just disable it? Thanks Jamplevia ( talk) 21:52, 30 October 2021 (UTC)
|issn= is a low value parameter. There are other opinions expressed at WP:ISSN. Apparently, portal.issn.org does not aid a reader of an en.wiki article in locating a copy of the periodical so, from that perspective, is of little use to our readers.
}}:
[https://portal.issn.org/resource/ISSN/2531-4661 ISSN 2531-4661 at issn.org] → ISSN 2531-4661 at issn.org
You recently edited Langbeinites a couple times to replace UTF numeral sub/superscript characters with either {{chem2}} or HTML <sup>...</sup> or <sub>...</sub> in the |title= field in {{cite}} templates. In both cases, this is not recommended because many fields of the various {{cite}} templates generate COinS metadata, which is used for citation cross-compatibility on the Internet, beyond just Wikipedia. See Template:Citation Style documentation/coins for {{cite}} fields that are COinS-producing. — sbb (talk) 12:58, 15 June 2021 (UTC)
@ Beland: (I outdented my reply because some of the formatting I used doesn't like to be part of the wikitext : indentation). Well, since COinS strings are emitted entirely as the value of the |title= parameter in empty HTML <span></span> tags, the only thing allowed in COinS strings is what can be in HTML attribute values. That's pretty much plain ASCII and URL-escaped entities. As an example, I created 3 references to a fake {{cite book}} reference titled H2O and r2, using 3 different ways to mark up the super- and subscripts (note also that the r is italicized with wiki markup):
Generated COinS data
ref 1:
<span title="ctx_ver=Z39.88-2004& rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook& rft.genre=book& rft.btitle=H%3Csub%3E2%3C%2Fsub%3EO+and+r%3Csup%3E2%3C%2Fsup%3E& rft.date=2021& rft.au=sbb& rfr_id=info%3Asid%2Fen.wikipedia.org%3AUser%3ASbb%2Fsandbox" class="Z3988"> </span>
ref 2
<span title="ctx_ver=Z39.88-2004& rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook& rft.genre=book& rft.btitle=H%26%238322%3BO+and+r%26sup2%3B& rft.date=2021& rft.au=sbb& rfr_id=info%3Asid%2Fen.wikipedia.org%3AUser%3ASbb%2Fsandbox" class="Z3988"> </span>
ref 3
<span title="ctx_ver=Z39.88-2004& rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook& rft.genre=book& rft.btitle=H%E2%82%82O+and+r%C2%B2& rft.date=2021& rft.au=sbb& rfr_id=info%3Asid%2Fen.wikipedia.org%3AUser%3ASbb%2Fsandbox" class="Z3988"> </span>
Note that in ref1, the plain HTML <sub>2</sub> and <sup>2</sup> are URL-escaped, telling anybody who consumes/uses that COinS string that the book's title is "H<sub>2</sub>O ...". It puts the constraint on the resource consumer to correctly parse HTML. Same situation with ref2, only instead of having to parse HTML <sub> and <sup> tags, they have to parse HTML entities. Still requires HTML parsing.
Only the last one, ref3, doesn't require HTML parsing, because the URL-escaped Unicode characters will be correctly interpreted.
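The difference between the three refs can be reproduced with Python's standard library. This is only a sketch of the percent-encoding step, not of how Module:Citation/CS1 builds its metadata; the three input strings are my reconstruction of the titles behind the rft.btitle values shown above. It shows that after percent-decoding, only the Unicode form is ready to use, while the other two still need an HTML-aware step.

```python
from html import unescape
from urllib.parse import quote_plus, unquote_plus

# Three spellings of the same title (wiki italics already stripped).
titles = {
    "ref1 (HTML tags)": "H<sub>2</sub>O and r<sup>2</sup>",
    "ref2 (HTML entities)": "H&#8322;O and r&sup2;",
    "ref3 (Unicode)": "H\u2082O and r\u00b2",
}


def encode_btitle(title: str) -> str:
    """Percent-encode a title the way the rft.btitle values above are encoded."""
    return quote_plus(title)


if __name__ == "__main__":
    for label, title in titles.items():
        encoded = encode_btitle(title)
        decoded = unquote_plus(encoded)  # what a consumer gets back
        print(label)
        print("  encoded:", encoded)
        print("  decoded:", decoded)
        # Entities need one more, HTML-specific, decoding step.
        print("  after html.unescape:", unescape(decoded))
```

Percent-decoding ref3 yields the final text directly; ref2 only becomes the same text after the extra `unescape` call, and ref1 still contains literal `<sub>`/`<sup>` tags afterwards.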
Having said all that, note that wikitext is stripped from the data during Wikipedia's COinS generation. So no italicization, bolding, etc., get emitted into the COinS strings. This means that something like a title like, "Book about USS Iowa", will get interpreted as Book about USS Iowa.
— sbb ( talk) 19:49, 15 July 2021 (UTC)
"it looks like both HTML and Unicode subscripts go through the system intact [...]" I wouldn't think of it that way. Per the OpenURL spec [1], "Recognizing the international environments in which ContextObjects will be used, the Committee selected Unicode as the abstract character repertoire for ContextObjects." The data is represented by Unicode, and encoded as UTF-8. An OpenURL parser is required to understand Unicode, so a Unicode subscript character's representation is consistent. But parsers aren't required to then interpret the received Unicode string as partial HTML markup. So an HTML substring is just that: some characters in the ASCII-range that may or may not be HTML, and aren't required to be parsed as such.
Use of templates within the citation template is discouraged because many of these templates will add extraneous HTML or CSS that will be included raw in the metadata. Also, HTML entities, for example &nbsp;, &ndash;, etc., should not be used in parameters that contribute to the metadata.
|title=, etc. fields. — sbb (talk) 21:04, 11 September 2021 (UTC)
"Do not use the Unicode subscripts and superscripts ² and ³, or XML/HTML character entity references (&sup2; etc.)."). I started that discussion several months ago, and it didn't gain much traction: Wikipedia talk:Manual of Style/Superscripts and subscripts § Add exception to allow Unicode super/subscripts in COinS fields in cite xxx templates? — sbb ( talk) 22:52, 11 September 2021 (UTC)
@ Trappist the monk: Greetings! To answer your question raised in this revert, Sbb started a thread at User talk:Beland#Use of Templates, HTML, and HTML entities within citation templates. I think that happened because I was going around changing articles (including citations) to conform with MOS:FRAC and Wikipedia:Manual of Style/Superscripts and subscripts, and the current guidelines result in HTML markup instead of Unicode precomposed fractions, superscripts, and subscripts. I couldn't find an authoritative COinS specification that explains how to handle superscripts, fractions (including those not available as precomposed characters), italics, and other markup in fields. I thought Sbb was advocating without opposition that Unicode characters be used instead of markup, and I was starting to change the guidelines to reflect that when we got your attention. Sbb also pointed out there has been opposition at Wikipedia talk:Manual of Style/Superscripts and subscripts. So, it would be good to discuss so I can get some clarification on what the consensus is here so I can update my spellcheck code and guideline pages if necessary. There are several possibilities for what to do:
<sup>...</sup> etc.
Thoughts? -- Beland ( talk) 02:59, 12 September 2021 (UTC)
title="..." attribute of an empty HTML <span>...</span> element that also has the attribute class="Z3988". HTML attributes cannot contain markup of any kind, so if it can't be sanitised to remove the markup, it must be omitted in the first place. -- Redrose64 🌹 (talk) 07:22, 12 September 2021 (UTC)
%3Csub%3E. It looks like other fields (like the URL of the page) also use percent-encoding, so downstream consumers would be expected to percent-decode out of course? The result of that decoding could be HTML or no-markup Unicode or MathML or whatever. -- Beland (talk) 16:49, 12 September 2021 (UTC)
<sup></sup> this is easy enough to be parsed correctly even by humans, but most math stuff is more complicated.
|descriptive-title= in addition to the proper title |title= and if the proper title is too complicated to use for metadata, pass down the descriptive title instead. -- Matthiaspaul (talk) 09:34, 12 September 2021 (UTC)
"nowrap preventing the horrible line break between double quote and start of title", I haven't seen this yet. Can you provide an example? -- Matthiaspaul ( talk) 09:43, 12 September 2021 (UTC)
"index entry" you mentioned, how would a work such as in David's example be represented in your classification databases? The question is in regard to the visual appearance as well as how it is encoded there. Is this something that can be derived from the proper title, or is it a descriptive title?
Two things. 1) No unicode characters. Those are a blight, and should be purged on sight. 2) Readers and accurate rendering of information are the priority. If COinS can't handle something, screw COinS. If magic codefu can be done to convert something non-COinS compliant to something COinS compliant behind the scenes (e.g. ''H''<sub>x</sub>20<sup>6</sup> → H_{x}20^{6} or whatever the COinS standard is), great, but it should not require editors to sacrifice accurate rendering. Headbomb { t · c · p · b} 14:51, 12 September 2021 (UTC)
A few considerations and a question:
-- Beland ( talk) 17:29, 12 September 2021 (UTC)
{{cite journal}}: templatestyles stripmarker in |title= at position 22 (help) — David Eppstein (talk) 18:57, 12 September 2021 (UTC)
&rft.atitle=Theory+of+free%2C+spin-%7F%27%22%60UNIQ--templatestyles-0000001B-QINU%60%22%27%7F%3Cspan+class%3D%22frac%22+role%3D%22math%22%3E%3Cspan+class%3D%22num%22%3E1%3C%2Fspan%3E%26frasl%3B%3Cspan+class%3D%22den%22%3E2%3C%2Fspan%3E%3C%2Fspan%3E+tachyons
Theory of free, spin-<span class="frac" role="math"><span class="num">1</span>⁄<span class="den">2</span></span> tachyons
Theory of free, spin-<span><span>1</span>⁄<span>2</span></span> tachyons
Theory of free, spin-1⁄2 tachyons
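The four stages listed above (percent-encoded value, decoded markup, markup without attributes, plain text) can be approximated with a short Python sketch. It is an illustration only: the function name and regular expressions are my own assumptions, not code from Module:Citation/CS1, and a regex-based tag stripper is adequate for simple `<span>` wrappers like this one but not for arbitrary HTML.

```python
import re
from html import unescape
from urllib.parse import unquote_plus

# Strip marker as it appears after percent-decoding: DEL ' " ` UNIQ--...-QINU ` " ' DEL
STRIP_MARKER = re.compile("\x7f'\"`UNIQ--[\\w-]+?-QINU`\"'\x7f")
TAG = re.compile(r"<[^>]+>")


def plain_title(rft_value: str) -> str:
    """Reduce a percent-encoded rft.atitle value to plain text."""
    text = unquote_plus(rft_value)     # undo the percent-encoding
    text = STRIP_MARKER.sub("", text)  # drop the templatestyles strip marker
    text = TAG.sub("", text)           # drop <span ...> and </span>
    return unescape(text)              # &frasl; becomes U+2044


if __name__ == "__main__":
    encoded = (
        "Theory+of+free%2C+spin-%7F%27%22%60UNIQ--templatestyles-0000001B-QINU"
        "%60%22%27%7F%3Cspan+class%3D%22frac%22+role%3D%22math%22%3E%3Cspan+"
        "class%3D%22num%22%3E1%3C%2Fspan%3E%26frasl%3B%3Cspan+class%3D%22den"
        "%22%3E2%3C%2Fspan%3E%3C%2Fspan%3E+tachyons"
    )
    print(plain_title(encoded))
```

Whether the plain-text result is an acceptable title for metadata is the editorial question under discussion; the sketch only shows that the reduction itself is mechanical for this kind of markup.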
alt= attribute with PNGs, plain text with TeX, or the contents of <annotation> elements with MathML), but this obviously doesn't cover all cases. It might be worth trying to further improve this, but we probably also need a |descriptive-title= to allow editors to specify themselves what should be passed on as metadata.
<math display=inline>210=14\times15=5\times6\times7=\binom{21}{2}=\binom{10}{4}</math>
<span class="mwe-math-element"><span class="mwe-math-mathml-inline mwe-math-mathml-a11y" style="display: none;"><math xmlns="http://www.w3.org/1998/Math/MathML" alttext="{\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}}">
<semantics>
<mrow class="MJX-TeXAtom-ORD">
<mstyle displaystyle="false" scriptlevel="0">
<mn>210</mn>
<mo>=</mo>
<mn>14</mn>
<mo>×<!-- × --></mo>
<mn>15</mn>
<mo>=</mo>
<mn>5</mn>
<mo>×<!-- × --></mo>
<mn>6</mn>
<mo>×<!-- × --></mo>
<mn>7</mn>
<mo>=</mo>
<mrow class="MJX-TeXAtom-ORD">
<mrow>
<mrow class="MJX-TeXAtom-OPEN">
<mo maxsize="1.2em" minsize="1.2em">(</mo>
</mrow>
<mfrac linethickness="0">
<mn>21</mn>
<mn>2</mn>
</mfrac>
<mrow class="MJX-TeXAtom-CLOSE">
<mo maxsize="1.2em" minsize="1.2em">)</mo>
</mrow>
</mrow>
</mrow>
<mo>=</mo>
<mrow class="MJX-TeXAtom-ORD">
<mrow>
<mrow class="MJX-TeXAtom-OPEN">
<mo maxsize="1.2em" minsize="1.2em">(</mo>
</mrow>
<mfrac linethickness="0">
<mn>10</mn>
<mn>4</mn>
</mfrac>
<mrow class="MJX-TeXAtom-CLOSE">
<mo maxsize="1.2em" minsize="1.2em">)</mo>
</mrow>
</mrow>
</mrow>
</mstyle>
</mrow>
<annotation encoding="application/x-tex">{\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}}</annotation>
</semantics>
</math></span><img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/4012a8a0261dae95c0a7443dbf67dcb58800df0c" class="mwe-math-fallback-image-inline" aria-hidden="true" style="vertical-align: -1.005ex; width:40.087ex; height:3.343ex;" alt="{\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}}"/>
<span class="mwe-math-fallback-source-inline tex" dir="ltr">$ {\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}} $</span>
<img src="https://wikimedia.org/api/rest_v1/media/math/render/png/4012a8a0261dae95c0a7443dbf67dcb58800df0c" class="mwe-math-fallback-image-inline" aria-hidden="true" style="vertical-align: -1.005ex; width:40.087ex; height:3.343ex;" alt="{\textstyle 210=14\times 15=5\times 6\times 7={\binom {21}{2}}={\binom {10}{4}}}" />
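Each of the renderings shown above still carries the TeX source somewhere: in the `<annotation>` element for MathML, in the `alt=` attribute for the image fallbacks, and between the `$ ... $` pair in the source span. A minimal Python sketch of that extraction, for experimenting outside the wiki, might look like the following. The function name `extract_tex` and the three regular expressions are illustrative assumptions, not the code used in Module:Citation/CS1/COinS.

```python
import html
import re

# Assumed patterns, written against the sample renderings quoted in this thread.
ANNOTATION = re.compile(r"<annotation\b[^>]*>(.*?)</annotation>", re.DOTALL)
IMG_ALT = re.compile(r'<img\b[^>]*\balt="([^"]*)"')
TEX_SOURCE = re.compile(r"\$\s*(.*?)\s*\$", re.DOTALL)


def extract_tex(rendering: str):
    """Return the TeX source hidden in a math rendering, or None if none is found."""
    for pattern in (ANNOTATION, IMG_ALT, TEX_SOURCE):
        match = pattern.search(rendering)
        if match:
            # Attribute and element content may be HTML-escaped.
            return html.unescape(match.group(1)).strip()
    return None
```

When nothing can be extracted the function returns `None`, which is the situation that would otherwise end up as MATH RENDER ERROR in the metadata.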
alt=
attribute; for LaTeX we took everything between the paired $...$
; for MathML we took the content of the <annotation>...</annotation>
tag.MATH+RENDER+ERROR
. Except for that, all of the rest of the metadata are correct:
<span ...>...
&rft.genre=article
&rft.jtitle=Publicationes+Mathematicae+Debrecen
&rft.atitle=MATH+RENDER+ERROR
&rft.volume=51
&rft.issue=1%E2%80%932
&rft.pages=175-189
&rft.date=1997
&rft_id=%2F%2Fwww.ams.org%2Fmathscinet-getitem%3Fmr%3D1468225%23id-name%3DMR
&rft.aulast=Pint%C3%A9r
&rft.aufirst=%C3%81kos
&rft.au=de+Weger%2C+Benjamin+M.+M.
</span>
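For readers unfamiliar with how the key/value pairs above are produced: each value is percent-encoded, with spaces written as `+`. A rough Python equivalent, using only the standard library, is sketched below. The helper name `coins_pairs` is made up for this illustration, and the sample values are taken from the metadata listing above.

```python
from urllib.parse import quote_plus


def coins_pairs(fields):
    """Percent-encode (key, value) pairs into a COinS-style '&key=value' string."""
    return "".join(f"&{key}={quote_plus(value)}" for key, value in fields)


metadata = coins_pairs([
    ("rft.genre", "article"),
    ("rft.jtitle", "Publicationes Mathematicae Debrecen"),
    ("rft.issue", "1\u20132"),      # en dash between 1 and 2
    ("rft.aulast", "Pint\u00e9r"),  # e with acute accent
])
print(metadata)
```

Anything that is not plain ASCII text, including leftover HTML or a stripmarker, is encoded just as faithfully, which is why markup in a title ends up as clutter in `&rft.atitle=`.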
&rft.atitle=
is dependent on the preference settings of the editor who last saved the article:
&rft.atitle=%3Cspan+class%3D%22nowrap%22%3E%7B%5Cdisplaystyle+210%3D14%5Ctimes+15%3D5%5Ctimes+6%5Ctimes+7%3D%7B%5Cbinom+%7B21%7D%7B2%7D%7D%3D%7B%5Cbinom+%7B10%7D%7B4%7D%7D%7D%3C%2Fspan%3E
&rft.atitle=%3Cspan+class%3D%22nowrap%22%3E210%3D14%5Ctimes+15%3D5%5Ctimes+6%5Ctimes+7%3D%7B%5Cbinom+%7B21%7D%7B2%7D%7D%3D%7B%5Cbinom+%7B10%7D%7B4%7D%7D%3C%2Fspan%3E
&rft.atitle=%3Cspan+class%3D%22nowrap%22%3EMATH+RENDER+ERROR%3C%2Fspan%3E
MATH+RENDER+ERROR
in the metadata. Alas, we cannot force editors to use PNG or LaTeX rendering, nor can we force MediaWiki to give us back the ability to extract content from math stripmarkers.|title=
is to have an alternate |math-title=
or some such that requires some sort of special-secret-markup that is not <math>...</math>
tags to wrap whatever would normally be in <math>...</math>
tags so, for example:
|math-title=A title with some text and $210=14\times15=5\times6\times7=\binom{21}{2}=\binom{10}{4}$ and yet more text
|math-title=
and then remove the special-secret-markup and put the result into the metadata. Then, the module would replace the special-secret-markup with actual opening and closing <math>...</math>
tags, and then preprocess a special template that renders the math title. That rendering then goes into |title=
. Yeah, pretty ugly, and I have no idea if it would work.|title=
with a <math>
tag (not all |title=
parameters are associated with cs1|2).|math-title=
that may, or may not, have $
delimited math text. If it finds a matched pair of $
delimiters, it replaces the delimiters with <math display=inline>
and </math>
and then preprocesses that string to get a math rendering that can be used in the citation's title:
{{#invoke:Sandbox/trappist_the_monk/math|math-title|math-title=$210=14\times15=5\times6\times7=\binom{21}{2}=\binom{10}{4}$}}
$210=14\times15=5\times6\times7=\binom{21}{2}=\binom{10}{4}$
|math-title=
might be used in the metadata as-is because the $
delimiters are 'native' to LaTeX / TeX.\$
(literal '$' appearing in math text), support for '$' appearing in plain text that is not math text – for |math-title=
, requiring editors to escape '$' when it appears in text that is not math text seems a reasonable restriction for this parameter. No doubt there is other stuff to do with this hack before we consider implementing it in the cs1|2 module suite.|title=
for our local purposes.|text-title=
and my |descriptive-title=
are basically the same idea, except that in his, the contents of |text-title=
would completely replace the contents of |title=
for metadata purposes (similar to how your |math-title=
would replace |title=
for both, our local rendering as well as the metadata), whereas my |descriptive-title=
could be used instead of a normal |title=
(if not given), but could also be combined with |title=
(when both are given). The contents of the descriptive title should be displayed without
text decoration when rendered (not sure if in front or following the normal title if both exist), and should be put into [square-brackets] in metadata to indicate that this is not the original title (probably prefixing the normal title if both exist). The different representation styles would allow to tell them apart when both are displayed or combined into the single &rft.atitle=
or &rft.btitle=
COinS key.|descriptive-title=
would effectively become your |math-title=
when it contains some $TeX$. (And for the rare case, where the $TeX$ stuff should not be interpreted in your suggested way, we have our ((accept-this-as-written)) syntax to indicate this.) This way, the editor would have the flexibility to provide either the |title=
or the |descriptive-title=
(including its special handling for math), or both.|descriptive-title=
in the past, non existing titles, dynamic titles, visual or acoustical only titles, functional titles, alias titles, unrepresentable titles, because too long, in unsupported scripts, or misleading in our context...none
" to display the localized "no title"), and the case where a title does exists, but should not be displayed for some reason (keyword "off
"), for example in an article listing many revisions of a work), but where we would still want to issue the complete metadata for it. Last year, I started to implement this by introducing these keywords to |title=none/off
, but realized we would still need something more like a |descriptive-title=
parameter to specify the title for metadata.|descriptive-title=
is scope creep. What we are trying to solve is the display of math in titles. So we should limit it to such with a name that makes it obvious the purpose of that parameter.
Izno (
talk) 18:02, 15 September 2021 (UTC)
|title=
. It's still a possibility, but when we are now tinkering with the idea of introducing a dedicated |math-title=
, it is important to also think about more general descriptive titles. After all, a title for a textual math representation is some kind of descriptive title. Otherwise, we easily end up with a whole new bunch of special title parameters, something, I think, we both want to avoid. Therefore, it is a valid question how to possibly combine this at least in the design, even if not all parts of the actual solution would be implemented at the same time.
Hmm, some sources, when ASCIIfying article titles, appear to use TeX-like markup inside special markers like "##" or "$". Examples:
[1]
[2]. And here's an example of <em>...</em>
where we'd probably want to use '', but I think the em tag gets emitted in the final HTML:
[3]. --
Beland (
talk) 23:49, 13 September 2021 (UTC)
|text-title=
parameter to the template to use as the text version of the title, and simultaneously to allow templates like {{
frac}} or {{
nowrap}} or whatever in titles when a text-title is present. That wouldn't address the inability to extract meaningful text from <math> formulas, but I'm sure Citation bot could be persuaded to add text-titles for those. One reason it's a bad idea is that the parameter would only produce invisible markup and therefore there wouldn't be much incentive for editors to make it accurate. —
David Eppstein (
talk) 00:34, 14 September 2021 (UTC)
|title=A {{frac|1|2}} Title
A '"`UNIQ--templatestyles-00000069-QINU`"'<span class="frac"><span class="num">1</span>⁄<span class="den">2</span></span> Title
class="frac"
, class="num"
, and class="den"
are defined. None of that styling is available to readers who consume the citation through the metadata.
Module:Citation/CS1 might remove the stripmarker, all class=
attributes, and any <span>...</span>
tags without attributes:
A <span role="math">1⁄2</span> Title
style=
might be one, and remove other html tags.|title={{nowrap|don't wrap this text}}
<span class="nowrap">don't wrap this text</span>
nowrap
class is defined in
MediaWiki:Common.css. cs1|2 would include this in the metadata:
don't wrap this text
<em>...</em>
is inappropriate use where <i>...</i>
would have been a better choice. Apparently the $
is a standard part of LaTeX and TeX used to delimit the beginning and end of math text; using a standardized delimiter is always better than making up our own delimiters. I've changed my example above to use the $
delimiters.$
delimits in-line math text, which TeX renders in a smaller font size. --
Shmuel (Seymour J.) Metz Username:Chatul (
talk) 14:48, 14 September 2021 (UTC)
$
delimiters are appropriate, right?$ ... $
for inline math and $$ ... $$
for display math is old-school TeX markup. The modern alternative (better for being less ambiguous wrt actual dollar signs, and also with some technical advantages in actual TeX for making it easier to hang hooks in the code) is \( ... \)
for inline math and \[ ... \]
for display math. The Wikimedia developers have vetoed allowing these to be shortcuts for math markup in the Wikimedia codebase, but I suppose that doesn't prevent them from being used in templates that intercept them and convert them to <math display=inline> ... </math>
and <math display=block> ... </math>
respectively. Would this actually work? Can math tags in template output still be expanded, or is math tag expansion only done before the templates are expanded? If this could be done in the existing |title=
parameter, I think that would be better than introducing a new multiplicity of confusing variations of title parameters. —
David Eppstein (
talk) 18:32, 15 September 2021 (UTC)
$ ... $
. It's not a valid reason to avoid \( ... \)
because very few references use that syntax (very likely, zero references) and because if they do we can fall back to the format-as-typed escape codes already used elsewhere in the citation templates. —
David Eppstein (
talk) 22:09, 15 September 2021 (UTC)|math-title=
(or whatever) into the normal |title=
, as David suggests, this would be better from the user's perspective than to introduce a dedicated parameter for this. The question, however, is how conflictive such $TeX$ stuff would be within normal titles. If collisions would be rather rare, we still have our ((accept-this-as-written)) syntax to force the template to take the title verbatim (which is already supported by |title=
to override the removal of terminal punctuation).|title=
parameter then the question is how at least text titles for math (which fall under the category of descriptive titles) can be combined with more general descriptive titles interfacewise, so that we eventually need only one new parameter rather than two for semantically close purposes.\( ... \)
delimiters:
{{#invoke:Sandbox/trappist_the_monk/math|math_test2|math-title=Entropy-Based Uncertainty Measures for \(L^2(\mathbb{R}^n),\ell^2(\mathbb{Z})\), and \(\ell^2(\mathbb{Z}/N\mathbb{Z})\) With a Hirschman Optimal Transform for \(\ell^2(\mathbb{Z}/N\mathbb{Z})\)}}
Entropy-Based Uncertainty Measures for \(L^2(\mathbb{R}^n),\ell^2(\mathbb{Z})\), and \(\ell^2(\mathbb{Z}/N\mathbb{Z})\) With a Hirschman Optimal Transform for \(\ell^2(\mathbb{Z}/N\mathbb{Z})\)
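The delimiter handling being discussed here can be sketched in Python as follows. This is only an approximation of the idea, not the sandbox Lua code: the function name `render_math_title` is invented for the example, and treating a value wrapped in `(( ... ))` as verbatim is an assumption based on the accept-this-as-written syntax mentioned in this thread. The display value gets `<math display=inline>` tags (which the module would then preprocess), while the metadata value keeps the TeX-delimited text as typed.

```python
import re

# Non-greedy match of \( ... \); parentheses inside the math are not a problem
# because only a backslash-prefixed ")" closes the span.
INLINE_MATH = re.compile(r"\\\((.+?)\\\)")


def render_math_title(title: str) -> tuple[str, str]:
    """Return (wikitext for display, text for metadata) for a title value."""
    # Assumed opt-out: ((...)) means "accept this as written".
    if title.startswith("((") and title.endswith("))"):
        verbatim = title[2:-2]
        return verbatim, verbatim
    # Cheap guard, in the spirit of a Title:find check: most titles have no math.
    if "\\(" not in title:
        return title, title
    display = INLINE_MATH.sub(r"<math display=inline>\1</math>", title)
    return display, title
```

An unmatched `\(` is simply left alone by this sketch; a real implementation would presumably want to flag it.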
<math>...</math>
tags).<math>...</math>
tags in parameter values are expanded into math stripmarkers before cs1|2 gets parameter values. After cs1|2 has rendered the citation, MediaWiki replaces each math stripmarker with its associated expansion. Using $...$
or \( ... \)
instead of <math>...</math>
tags allows us to apply <math>...</math>
tags and then expand them into math stripmarkers (to be replaced by MediaWiki after cs1|2 final rendering) at the time of our choosing.|title=
is that we have to inspect every |title=
value for the \( ... \)
delimiters and it is possible that some title somewhere legitimately uses the TeX delimiters. Inspecting every |title=
value is relatively inexpensive because all we have to look for is the opening \(
delimiter so if Title:find ('\\%(') then ... end
– attempt to convert delimiters to <math>...</math>
tags only when a \(
delimiter is present. I found only
two instances of the opening \(
delimiter; one is vandalism and the other a malformed title. It would not be so simple with the $...$
delimiters so if we proceed with this solution and choose to use $...$
delimiters, implementing |math-title=
along-side |title=
is the better choice.\( ... \)
to mark TeX blocks, for as long as our (( ... ))
wrapping syntax would disable the feature.\( ... \) TeX delimiters experiment removed |
---|
|
warning messages. If this change is accepted, I expect to remove the parts of Module:Citation/CS1/COinS that decoded the math stripmarker content – it won't be needed.
that decoded the math stripmarker contents, this would not affect the code for SVG and LaTeX math extraction, only for MathML, right?
<math>...</math>
markup in a |title=
parameter, and then publish that article, the live cs1|2 module will create the metadata string for that citation (coins_replace_math_stripmarker()
) using the math settings in my preferences because MediaWiki renders that math image into a stripmarker before cs1|2 gets the content of |title=
. Since the stripmarker was created using my settings, the metadata will be derived from my settings. The resulting metadata are then cached for everyone until some other editor saves the article and their math preference setting is different from mine.\( ... \)
TeX delimiters are introduced, as noted elsewhere in this discussion, an awb or some such script will be required to replace the <math>...</math>
markup which will cause an article refresh and so new metadata using the \( ... \)
TeX delimited wikitext straight from the appropriate parameter. Because we feed the metadata directly from the \( ... \)
TeX delimited wikitext, there is no need (and no ability) to decode a math stripmarker so the code that decoded the math stripmarker content (even if it still worked) will no longer be needed and so should be removed. If we ever need it, we can always get it back from a previous version of the module.<math>...</math>
markup at least for a while and it would have been convenient for them if they could continue to use it for entries which either do not contribute to the metadata, or to entries contributing to metadata, if they have selected SVG or LaTeX, not MathML. However, this would put the burden to switch to \( ... \)
on the next editor with MathML settings and would also leave the citation source code in a mix of markups, which might not be desirable for other parties which read our wikitext rather than metadata, so yes, I agree, a hard switch is probably the better approach here.
a huge number of errors with existing citations is an accurate description. If these search results are to be believed, there are:
<math>...</math>
tags to \( ... \)
TeX delimiters. That script can be run on the same day that the change goes live (if it goes live) and be done after a couple of hours.|script-=
parameters which become part of the metadata. Are there other parameters ending up in metadata where math-like constructs could show up occasionally? What about the journal name, work, author etc. names and publisher entries?|trans-=
parameters, and |quote=
, |script-quote=
and |trans-quote=
. Should we support the \(...\) syntax there as well as an alternative for syntax compatibility/consistency, or should we insist on <math>
there?\( ... \)
TeX delimiters for math markup. |journal=
? |work=
? |publisher=
? |author=
? I don't think so; at least not until a need has been sufficiently demonstrated. Quick searches for <math>...</math>
tags in those parameters either timed out with no results or returned no results. We should only support one form of math markup.\(...\)
TeX delimiters experiment.Given the new proposed solution for <math>...</math>
markup and the above comments, I'm wondering where we've come down on how to handle simple markup. I see contradictions between editors like "No unicode characters. Those are a blight, and should be purged on sight." vs. "exactly as the source information presents it, 'funny' characters and all". David Eppstein said titles should be formatted "the way the reference formatted it, even when our style guidelines would tell us to use a different style", but then used {{
frac}} instead of ½. Our style guide says to use {{
sfrac}} for science articles, so that seems to satisfy neither the goal of looking consistent with the style of body text nor the goal of being exactly the same as the original document for ease of search.
What are we proposing as the solution for simple markup, like a chemical formula? If we're following the sources exactly, we might use no-markup Unicode vs. <sup>...</sup>
depending on what the original document does, though if it's on paper or PDF it will be impossible to tell. If we're avoiding Unicode compatibility characters, then we still have at least three choices:
{{
cite book}}
: templatestyles stripmarker in |title=
at position 17 (
help) - {{chem2|H2O2}} - Copy-paste: Something about H 2O 2
Though it's unclear to me how well any database or web search engine is going to handle the difference between say, "H2O2" as a search parameter and an internally stored "H<sub>2</sub>O<sub>2</sub>". -- Beland ( talk) 17:19, 18 September 2021 (UTC)
{{
chem2}}
; your example renders this mishmash:
'"`UNIQ--templatestyles-0000008D-QINU`"'<span class="chemf nowrap">H<sub class="template-chem2-sub">2</sub>O<sub class="template-chem2-sub">2</sub></span>
<br />
tags:
Something about H
2O
2.
<sub>...</sub>
tags and friends cleanly? A rule like "use HTML sup/sub tags instead of Unicode subscripts and superscripts" would be easy to follow and easy to enforce, so I'm thinking maybe do that for now?{{
frac}}
above at my 11:46, 14 September 2021 post.|title=Naked Gun 33{{sfrac|1|3}}
|title=Naked Gun 33'"`UNIQ--templatestyles-00000092-QINU`"'<span class="sfrac"><span class="tion"><span class="num">1</span><span class="sr-only">/</span><span class="den">3</span></span></span>
templatestyles
stripmarker refers to
Template:Sfrac/styles.css which defines the classes: sfrac
, tion
, num
, den
, and sr-only
. As with {{frac}}
, none of that styling is available to readers who consume the citation through the metadata so for them, the markup is just meaningless clutter.|descriptive-title=
(or |text-title=
per David) as a fallback, so that editors can use fancy stuff in |title=
for pretty local display purposes (without compromising for COinS), while still being able to exactly match a title, if known, as it may be used in an external database (regardless of what representation or transliteration may be used there) so that it can be used as a search pattern there as well.
filter/cleanup/simplify solution may not be sufficient. In my sandbox I've hacked some code that removes:
<br /> tags (used in {{chem2}})
class= attributes from <span> tags
style= attributes from <span> tags
title= attributes from <span> tags
<span> without attributes and its matching </span>
Naked Gun 33{{code|{{sfrac|1|3}}}}
Naked Gun 33'"`UNIQ--syntaxhighlight-00000099-QINU`"'
{{#invoke:Sandbox/trappist_the_monk/math|span_test|1=Naked Gun 33{{sfrac|1|3}}}}
Naked Gun 331/3
{{chem2|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2}}
'"`UNIQ--templatestyles-000000A1-QINU`"'<span class="chemf nowrap">[{(η<sup>5</sup>-C<sub class="template-chem2-sub">5</sub>Me<sub class="template-chem2-sub">4</sub>)SiMe<sub class="template-chem2-sub">2</sub>(η<sup>1</sup>-NCMe<sub class="template-chem2-sub">3</sub>)}(PMe<sub class="template-chem2-sub">3</sub>)Sc(μ<sub>2</sub>-H)]<sub class="template-chem2-sub">2</sub></span>
{{#invoke:Sandbox/trappist_the_monk/math|span_test|1={{chem2|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2}}}}
[{(η<sup>5</sup>-C<sub class="template-chem2-sub">5</sub>Me<sub class="template-chem2-sub">4</sub>)SiMe<sub class="template-chem2-sub">2</sub>(η<sup>1</sup>-NCMe<sub class="template-chem2-sub">3</sub>)}(PMe<sub class="template-chem2-sub">3</sub>)Sc(μ<sub>2</sub>-H)]<sub class="template-chem2-sub">2</sub>
{{chem2|C2H3O2(-)}}
'"`UNIQ--templatestyles-000000A9-QINU`"'<span class="chemf nowrap">C<sub class="template-chem2-sub">2</sub>H<sub class="template-chem2-sub">3</sub>O<span class="template-chem2-su"><span>−</span><span>2</span></span></span>
{{#invoke:Sandbox/trappist_the_monk/math|span_test|1={{chem2|C2H3O2(-)}}}}
C<sub class="template-chem2-sub">2</sub>H<sub class="template-chem2-sub">3</sub>O−2
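For anyone who wants to play with the filter rules offline, the removals listed above (stripmarker, `<br />` tags, `class=`/`style=`/`title=` attributes on `<span>` tags, and attribute-less `<span>` pairs) can be approximated in Python as below. This is a sketch under assumptions and not the sandbox module: `simplify_title` and the regular expressions are invented for the illustration, and the stripmarker pattern is guessed from the examples quoted in this thread.

```python
import re

# Stripmarker as it appears in the examples here; the \x7f delimiters are optional.
STRIPMARKER = re.compile(r"""\x7f?'"`UNIQ--[A-Za-z]+-[0-9A-Fa-f]{8}-QINU`"'\x7f?""")
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)
SPAN_TAG = re.compile(r"<(/?)span\b([^>]*)>", re.IGNORECASE)
DROPPED_ATTR = re.compile(
    r"""\s*\b(?:class|style|title)\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)""",
    re.IGNORECASE,
)


def simplify_title(title_html: str) -> str:
    """Strip presentation-only markup from a rendered title value."""
    text = BR_TAG.sub("", STRIPMARKER.sub("", title_html))
    out = []
    kept = []  # one flag per open <span>: True if its tags are kept
    pos = 0
    for tag in SPAN_TAG.finditer(text):
        out.append(text[pos:tag.start()])
        pos = tag.end()
        if tag.group(1):  # closing tag: keep it only if its opener was kept
            if kept and kept.pop():
                out.append("</span>")
        else:
            attrs = DROPPED_ATTR.sub("", tag.group(2)).strip()
            kept.append(bool(attrs))
            if attrs:
                out.append(f"<span {attrs}>")
    out.append(text[pos:])
    return "".join(out)
```

As in the examples above, only `<span>` tags lose their attributes; `<sub>` and `<sup>` tags pass through untouched, including their `class=` attributes.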
 
when a <span role="math">
gets eliminated, and when the text before and after a <span>x<br/>y</span>
to be eliminated would be framed in <sub>y</sub>
and <sup>x</sup>
(perhaps only if inside a class="chemf"?).{{chem2|C2H3O2(-)}}
example a bit, and ideally, the stripped markup in that example should look just like the input. The "2" subscript should bind closer to the "O" than the "-" charge. —
sbb (
talk) 01:22, 24 September 2021 (UTC)
{{
chem2/sandbox}}
input. Why anchor encoding? Before cs1|2 parameter values are added to the metadata, they are percent-encoded so, in its present form, what the metadata will get is:
[Cl4Re\qReCl4](2−)
← {{
chem2/sandbox|[Cl4Re\qReCl4](2−)}}
– anchor encoding
%26%2391%3BCl4ReqReCl4%26%2393%3B%282%E2%88%92%29
– percent encoding of the anchor-encoded input%5BCl4ReqReCl4%5D%282%E2%88%92%29
← [Cl4Re\qReCl4](2−) – percent encoding\q
is treated as an escaped q
so the result is missing the \
(%5C
). This particular {{chem2}}
input needs to be tweaked to escape the back slash:
%5BCl4Re%5CqReCl4%5D%282%E2%88%92%29
← [Cl4Re\\qReCl4](2−) – percent encoding{{chem2}}
accepts standardized so that consumers of the metadata will know what they mean when the metadata are decoded? If not then those symbols need to be replaced with the actual thing that they represent, don't they?MeTaDaTa-OuTpUt:
), but optionally also the (parameter) input (i.e. MeTaDaTa-InPuT:
). This might allow the extractor not only to replace the complete output of a template by its metadata (as in the current {{
chem2}} example) but allow metadata fragments to be inherited from internally called templates instead of having to handle everything monolithically on the level of the outer template (example: {{
chem}}, which internally uses {{
su}} - still thinking about the details...)are the input symbols that {{chem2}}
accepts standardized so that consumers of the metadata will know what they mean when the metadata are decoded?
I guess, this very much depends on the template, so even if this would be a standard notation in this particular case (I don't know if it is), it probably won't be in the general case. However, this is still a demo with the main purpose to illustrate how easy it would be to enhance templates in general. In a proper implementation, {{
chem2}} would probably not just forward its own input as metadata, but actually generate the metadata by processing the input (like it does for its normal output, but) in a form which would be text-only or use only very simple markup. What can be considered to be the best metadata very much depends on the purpose/function of the template. The advantage of this approach would be that the developers or users of the template probably know best what is the optimal text-only metadata that can be generated from the input (developers would program the template to generate the optimal metadata for the context the template is used in, and users would always be able to override it using the |metadata=
parameter), whereas the generic HTML simplifier in CS1/CS2 has no knowledge on the context and semantics and can only simplify based on universal structural rules. 
but ignores  
. I have added a workaround at least to the wrapper:
Module:DecodeEncode.decode.)title=
attribute. We could then use this instead of the actual HTML for metadata purposes (similar to what we do with math SVG and LaTeX extraction). Given that the HTML title= might be used by the template for other purposes already, and that it is also shown to users as tooltip (which might not be desirable if it contains stuff like "[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2
"), I am using the title= attribute only for illustration purposes here and we might find another HTML attribute or establish a special "
steganographic" notation where/how we could transparently hide those entries for possible extraction by CS1/CS2. Templates might even have a standardized optional parameter like |metadata=
to override what the template would otherwise use for this. Templates enhanced this way could get a sticker like "CS1/CS2-compatible" or such. Sure, this would work only for those templates which have been enhanced this way, but all we would have to do now is to specify a standard for this and implement a generic extraction mechanism which would take over whenever CS1/CS2 finds this special HTML attribute/notation in a citation's title. Over the years more and more templates could be adapted accordingly.<span class="MeTaDaTa::[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2">
normal_template_output
</span>
MeTaDaTa
" magic), it would replace the whole span including normal_template_output
(and, if present, also the corresponding stripmarker) by what follows the ::
following the MeTaDaTa
(which probably needs to be encoded in an actual implementation). For a template call like {{chem2|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2}}
this would result in [{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2
. Would the template be called like {{chem2|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2|metadata=This is a text-only transcription of the chemical formula}}
instead, it would result in This is a text-only transcription of the chemical formula
. |metadata=off/none
would disable the metadata (nothing would be following the "::
" then). If the extractor does not find the triggering magic, or if the extracted data would be an empty string, it would proceed with the HTML simplification demoed above...{{frac/sandbox|1|2|3}}
'"`UNIQ--templatestyles-000000BB-QINU`"'<span class="frac MeTaDaTa::%E2%80%891%C2%A02%2F3" role="math">1<span class="sr-only">+</span><span class="num">2</span>⁄<span class="den">3</span></span>
{{frac/sandbox|1|2|3|metadata=Custom-Metadata}}
'"`UNIQ--templatestyles-000000C0-QINU`"'<span class="frac MeTaDaTa::Custom-Metadata" role="math">1<span class="sr-only">+</span><span class="num">2</span>⁄<span class="den">3</span></span>
{{frac/sandbox|1|2|3|metadata=off}}
'"`UNIQ--templatestyles-000000C4-QINU`"'<span class="frac MeTaDaTa::" role="math">1<span class="sr-only">+</span><span class="num">2</span>⁄<span class="den">3</span></span>
{{sfrac/sandbox|1|2|3}}
'"`UNIQ--templatestyles-000000C8-QINU`"'<span class="sfrac">1<span class="sr-only">+</span><span class="tion"><span class="num">2</span><span class="sr-only">/</span><span class="den">3</span></span></span>
{{sfrac/sandbox|1|2|3|metadata=Custom-Metadata}}
'"`UNIQ--templatestyles-000000CC-QINU`"'<span class="sfrac">1<span class="sr-only">+</span><span class="tion"><span class="num">2</span><span class="sr-only">/</span><span class="den">3</span></span></span>
{{sfrac/sandbox|1|2|3|metadata=off}}
'"`UNIQ--templatestyles-000000D0-QINU`"'<span class="sfrac">1<span class="sr-only">+</span><span class="tion"><span class="num">2</span><span class="sr-only">/</span><span class="den">3</span></span></span>
{{chem2/sandbox|[{(\h{5}C5Me4)SiMe2(\h{1}NCMe3)}(PMe3)Sc(\m{2}H)]2}}
'"`UNIQ--templatestyles-000000D4-QINU`"'<span class="chemf nowrap">[{(η<sup>5</sup>-C<sub class="template-chem2-sub">5</sub>Me<sub class="template-chem2-sub">4</sub>)SiMe<sub class="template-chem2-sub">2</sub>(η<sup>1</sup>-NCMe<sub class="template-chem2-sub">3</sub>)}(PMe<sub class="template-chem2-sub">3</sub>)Sc(μ<sub>2</sub>-H)]<sub class="template-chem2-sub">2</sub></span>
{{chem2/sandbox|C2H3O2(-)}}
'"`UNIQ--templatestyles-000000D8-QINU`"'<span class="chemf nowrap">C<sub class="template-chem2-sub">2</sub>H<sub class="template-chem2-sub">3</sub>O<span class="template-chem2-su"><span>−</span><span>2</span></span></span>
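The extraction step proposed for the `MeTaDaTa::` marker could work roughly as in the Python sketch below: find the tagged span, replace the whole span (and a stripmarker directly in front of it) by the percent-decoded payload, and return nothing when the marker is absent or the payload is empty so that the caller can fall back to HTML simplification. All names here (`extract_embedded_metadata` and the patterns) are hypothetical, and the marker syntax itself is only the demo notation from this discussion.

```python
import re
from urllib.parse import unquote

# Opening <span> whose class attribute contains the demo marker; group 1 is the payload.
MAGIC_SPAN = re.compile(r'<span\b[^>]*\bclass="[^"]*\bMeTaDaTa::([^"\s]*)[^"]*"[^>]*>')
SPAN_TAG = re.compile(r"<(/?)span\b[^>]*>")
TRAILING_STRIPMARKER = re.compile(r"""\x7f?'"`UNIQ--[A-Za-z]+-[0-9A-Fa-f]{8}-QINU`"'\x7f?$""")


def extract_embedded_metadata(title_html: str):
    """Replace a MeTaDaTa-tagged span by its embedded text; None means 'fall back'."""
    opening = MAGIC_SPAN.search(title_html)
    if opening is None or not opening.group(1):
        return None  # no marker, or metadata switched off: use the HTML simplifier
    depth = 1
    for tag in SPAN_TAG.finditer(title_html, opening.end()):
        depth += -1 if tag.group(1) else 1
        if depth == 0:  # found the close that matches the tagged span
            before = TRAILING_STRIPMARKER.sub("", title_html[:opening.start()])
            return before + unquote(opening.group(1)) + title_html[tag.end():]
    return None  # unbalanced markup: leave it to the fallback
```

With the `{{frac/sandbox|1|2|3|metadata=Custom-Metadata}}` output shown above, this yields just `Custom-Metadata` in place of the span.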
\(...\)
markup so that may go away. Removing certain html markup may or may not be adequate; I don't know, I'm not a chemist so I don't know if the resulting output to the metadata would be at all useful.\(...\)
at all, quite the contrary. He was concerned that issuing a visible error message would upset people, but that was before we discussed alternatives.
Theory of free, spin-<span class="frac" role="math"><span class="num">1</span>⁄<span class="den">2</span></span> tachyons
Theory of free, spin-1⁄2 tachyons
<sup>
and <sub>
instead of <span>
s? The sup/sub can be styled for on-Wiki use, and stripped of class attributes that are meaningless to COinS consumers. —
sbb (
talk) 01:26, 24 September 2021 (UTC)
Are there any objections or comments on doing the following to get the ball rolling:
<sub>...</sub>
and <sup>...</sup>
instead of {{
chem}} and {{
chem2}} for chemistry formulas in citations?
I don't often see more complicated math in citations, but would we want to make a {{
citemath}} that uses <math>...</math>
for now and can be switched over to \(...\) when that change is ready to be deployed? (And then easily changed again later if handling of TeX-like math formulas needs to change.) --
Beland (
talk) 01:19, 23 September 2021 (UTC)
|descriptive-title=
(which would also be useful for many other purposes, as mentioned further above). With this in place, we may need only a few general recommendations how to provide titles instead of having to address this explicitly in the MOS.{{frac|1|2|3}}
|title=
, prompting the current implementation to throw a stripmarker error message:
'"`UNIQ--templatestyles-000000DD-QINU`"'<span class="frac">1<span class="sr-only">+</span><span class="num">2</span>⁄<span class="den">3</span></span>
1+2⁄3
<span role="math">1+2⁄3</span>
1+2⁄3
|descriptive-title=
). Assuming the COinS-consuming entity would be able to process HTML, an HTML engine at their end would produce this from the simplified HTML:
1 2/3
existing Unicode superscripts and subscripts, while I agree that the HTML sub- and superscripts look nicer if used in formulas and are generally to be preferred, in non-scientific articles an occasionally interspersed Unicode super- or subscript character in citation titles might not be a bad idea at all. At least they are COinS-safe out of the box and require neither an HTML engine at the receiver's end nor a TeX-savvy human to be decoded. I would not use them in technical articles, but also would not want to ban them in non-technical articles. So, it all depends on the context IMO. What does this mean in regard to MOS or more citation-related guidelines? We could offer some generally recommended best practices there, but we should not rule out any of the possible formats in general. And what does that mean in regard to CS1/CS2? We will have to cope with whatever editors throw at us, therefore we probably need all of them: special \( ... \) markup, an HTML simplifier, template-internal metadata, and
|descriptive-title=
to cope with all possible cases optimally.
"hey I'll have that code done in the next couple weeks", the next CS1/CS2 update isn't scheduled yet but I guess it could be in mid-October.
|descriptive-title=
), would allow us to address all aspects of the problem in the best-possible way without putting restrictions on users which templates or math markup they can use in citations, so that they can use what is best (based on their editorial capabilities) to produce the desired nice-looking output in rendered citations, but still would produce (or at least allow to produce) perfectly simplified and semantically correct metadata at the same time.
I tweaked this citation today and introduced an extraneous =
character in |newspaper==Duluth News Tribune
. But, I did not see it so the article got published with my error. I expect that I'm not the only one to have done that. So, I've tweaked the extraneous punctuation test:
| Wikitext | {{cite news |
|---|---|
| Live | "Lynn Diane (Swapinski) Jurek". Obituaries. =Duluth News Tribune. Duluth, MN. Archived from the original on 2013-01-21. {{cite news}}: CS1 maint: extra punctuation (link) |
| Sandbox | "Lynn Diane (Swapinski) Jurek". Obituaries. =Duluth News Tribune. Duluth, MN. Archived from the original on 2013-01-21. {{cite news}}: CS1 maint: extra punctuation (link) |
Extraneous punctuation is not considered an error so the article ends up in Category:CS1 maint: extra punctuation and cs1|2 displays the green maintenance message for those few who have enabled maintenance messaging.
— Trappist the monk ( talk) 18:58, 1 November 2021 (UTC)
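The kind of test described above can be sketched in a few lines of Python. This is not the Lua used by Module:Citation/CS1; the function name and the exact set of characters tested are assumptions made for illustration. The only case taken from the post above is a value that begins with a stray `=`.

```python
# Hypothetical helper, not the module's real code.
# Assumption: a leading '=' or a trailing ',', ';' or ':' counts as extraneous.
LEADING_CHARS = "="
TRAILING_CHARS = ",;:"


def has_extra_punctuation(value: str) -> bool:
    """Return True when a parameter value starts or ends with stray punctuation."""
    text = value.strip()
    if not text:
        return False
    return text[0] in LEADING_CHARS or text[-1] in TRAILING_CHARS
```

With these assumptions, the value from `|newspaper==Duluth News Tribune` is flagged because the value itself begins with `=`.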
I've just formatted a reference to RFC 9134 and have been advised to report the "check |rfc= value" error here. Range checking apparently needs to be updated. ~ Kvng ( talk) 17:33, 2 November 2021 (UTC)
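For readers unfamiliar with this kind of check, a range test on an identifier can be sketched as follows. This is a Python illustration only, not the module's code; the function name and the limit value are hypothetical. It shows why a newly published RFC can trip the test until the limit is raised.

```python
# Hypothetical upper bound, chosen only for this illustration.
# A real limit has to be raised from time to time as new RFCs are published.
RFC_UPPER_LIMIT = 9300


def rfc_is_valid(value: str, upper_limit: int = RFC_UPPER_LIMIT) -> bool:
    """Accept ASCII digits only, no leading zero, within 1..upper_limit."""
    if not (value.isascii() and value.isdigit()) or value.startswith("0"):
        return False
    return 1 <= int(value) <= upper_limit
```

Under a stale limit such as 9000, `rfc_is_valid("9134", upper_limit=9000)` is false even though the number is well formed.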
I've made a piece of software called PressPass (
a longer explanation of its features and functions is here, along with the code). Essentially, what it does is automatically generate filled-out {{
cite news}} invocations from
Newspapers.com clippings and search pages. Currently, I am revising some parts of the generation functions, and making a configuration menu (for stuff like, e.g., whether to include access-date
). However, I would like to ensure that the templates it generates are properly formatted.
Here is what it looks like, for this clipping:
<ref name="Charle18031112">{{Cite newspaper|url=https://www.newspapers.com/clip/87466966/public-auction/|date=1803-11-12|page=4|title=Public Auction|newspaper=The Charleston Daily Courier|location=Charleston, South Carolina}}</ref><!-- Sat -->
So far, in the configuration menu, I'm writing features to allow multi-line cite templates, as well as different options for the date output (1969-12-31, 31-12-1969, 1969 Dec 31, 1969 December 31, December 31, 1969), and the ability to specify whether access-date
, via
, or location
are included.
This is, more or less, all the information exposed to my script from the clipping page. The headline has to be typed in manually by the user since Newspapers.com doesn't have this all scraped. That said, I understand that there's a lot of "best practices" with regard to the ordering of parameters, et cetera (and I can have the date output as whatever). Since I expect this to be used a lot (and have been using it a lot myself, for example in improving Bradford Island to FA), is there anything I should be doing? Is there anything I'm missing? jp× g 20:43, 21 October 2021 (UTC)
|title=
, so please ensure that your tool requires it. If you are adding |access-date=
, remember that it requires |url=
. Some of those date formats are invalid on Wikipedia and in CS1 templates. See
MOS:DATE for valid formats. –
Jonesey95 (
talk) 20:58, 21 October 2021 (UTC)
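The two requirements above (a title is mandatory, and `|access-date=` only makes sense alongside `|url=`) could be enforced in a generator roughly like this. It is a minimal Python sketch; the function name, the dictionary input and the parameter order are assumptions, not PressPass's actual design.

```python
# Hypothetical generator sketch; parameter order is an arbitrary choice.
PARAM_ORDER = ["title", "newspaper", "date", "page", "url", "access-date", "via"]


def build_cite_news(fields: dict) -> str:
    """Build a {{cite news}} invocation from clipping fields."""
    params = dict(fields)
    if not params.get("title"):
        # The template reports an error without |title=, so refuse early.
        raise ValueError("|title= is required")
    if not params.get("url"):
        # |access-date= requires |url=, so drop it when there is no URL.
        params.pop("access-date", None)
    parts = [f"|{key}={params[key]}" for key in PARAM_ORDER if params.get(key)]
    return "{{cite news " + " ".join(parts) + "}}"
```

Raising an exception for a missing title, rather than emitting an empty `|title=`, keeps the broken citation from ever reaching an article.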
<ref>
tag name=
attribute and hyphenate the date; consider including the page number and provision for a disambiguator for the cases where multiple articles sharing the same page are cited (<ref name="Charleston 1803-11-12 p4a">
).
|via=Newspapers.com
because that is the recommended form at
WP:Newspapers.com and because the clipping is not delivered by the publisher.
|location=
should not be displayed except when it is needed to disambiguate the newspaper named in |newspaper=
.
|access-date=
is generally not required (the newspaper is not an ephemeral source).
|pages=[https://www.newspapers.com/clip/56035464/the-los-angeles-times/ 1], [https://www.newspapers.com/clip/56035546/the-los-angeles-times/ 10]
for as many pages as are necessary; allow for page ranges
{{
Cite newspaper}}
to {{
Cite news}}
or maybe {{
Cite web}}
. Some discussions
here and
here. I also don't see the need to note the day of the week, nor do I understand why it's an HTML comment. And regarding date formats: extremely cool, albeit not to be expected, would be if PressPass tried to automatically use the format specified in the page's {{Use xxx dates}}
template, if present. Otherwise, what Jonesey said (22:35). —
JohnFromPinckney (
talk /
edits) 00:21, 22 October 2021 (UTC)
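The suggestion about following the page's {{Use xxx dates}} template could look something like the sketch below. It is Python for illustration only; the function name, the regular expression and the fallback style are assumptions, and only the dmy and mdy variants are handled.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Matches {{Use dmy dates}} or {{Use mdy dates}}, ignoring case and extra spaces.
USE_DATES = re.compile(r"\{\{\s*use\s+(dmy|mdy)\s+dates\b", re.IGNORECASE)


def format_for_page(d: date, wikitext: str, default: str = "mdy") -> str:
    """Format a date in the style requested by the page, else the default."""
    match = USE_DATES.search(wikitext)
    style = match.group(1).lower() if match else default
    month = MONTHS[d.month - 1]
    if style == "dmy":
        return f"{d.day} {month} {d.year}"
    return f"{month} {d.day}, {d.year}"
```

Month names are taken from a fixed list rather than `strftime` so the output does not depend on the locale of the machine running the script.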
|location=
should not be displayed except when it is needed to disambiguate the newspaper named in |newspaper=
".|location=
is ESSENTIAL unless the name of the city of publication is part of the name of the newspaper". --
Alarics (
talk) 09:33, 22 October 2021 (UTC)
We do not permit the use of the YYYY-MM-DD format for dates in the Julian calendar. The Newspapers.com site provides coverage for some newspapers that were published before 1752, the year in which the British colonies in North America changed from the Julian calendar to the Gregorian calendar. An example of a newspaper available from newspapers.com for this situation is The Pennsylvania Gazette for the year range 1728 to 1752. The metadata emitted by Citation Style 1 is false for Julian calendar dates. In this situation, I suggest you emit a plain text citation with no template. For example,
[This example illustrates another failing of Citation Style 1. It lacks the ability to use a description as a title, for cases when the author or publisher have not given a story a title.]
Jc3s5h ( talk) 14:51, 22 October 2021 (UTC)
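A tool could guard against the Julian-date problem described above with a check along these lines. This is a Python sketch under stated assumptions: the function name is hypothetical, and the cutoff applies only to newspapers that followed the British changeover, where 2 September 1752 (Julian) was followed by 14 September 1752 (Gregorian).

```python
# First Gregorian date in Great Britain and its colonies.
BRITISH_CHANGEOVER = (1752, 9, 14)


def may_use_iso_format(year: int, month: int, day: int) -> bool:
    """Return True when a YYYY-MM-DD date can be assumed to be Gregorian.

    Tuples are compared instead of datetime.date objects because a valid
    Julian date (for example 29 February 1700) does not exist in Python's
    proleptic Gregorian calendar and would raise ValueError.
    """
    return (year, month, day) >= BRITISH_CHANGEOVER
```

When the check fails, the tool would fall back to a spelled-out date or, as suggested above, to a plain-text citation.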
FWIW, saving your clipping into Zotero, then dragging it here from there, using Zotero's Wikipedia exporter function, produces:
{{Cite news| pages = 4| title = Public Auction| work = The Charleston Daily Courier| location = Charleston, South Carolina| accessdate = 2021-10-25| date = 1803-11-12| url = https://www.newspapers.com/clip/87466966/public-auction/}}
While any volunteer is, of course, free within reason to work on anything they like, perhaps effort would be better deployed in tweaking that exporter, rather than (if I may) reinventing the wheel? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:53, 25 October 2021 (UTC)
I've made a lot of improvements to the script, and a new version is here; thanks to everyone who helped me out in this discussion! The documentation page covers all of its behavior: let me know if there is anything I missed. jp× g 21:46, 7 November 2021 (UTC)
|author=
& |agency=
. These are not so much discovery parameters, although they may help in finding the correct source faster, but they have a reliability component. An otherwise unreliable source (such as a newspaper that is an official or semi-official organ of a political party, or is state-controlled) may carry a press report from a press agency with proven prior reliability. Or may take through syndication, the column of an author with a history of past objectivity. Such treatment may provide the citation with elevated reliability. In any case, |author=
should probably be included even when there is no byline. This is recommended: |author=<!--Staff writer(s); no byline.-->
.
172.254.222.178 (
talk) 23:17, 7 November 2021 (UTC)
|author=<!--Not stated-->
; see
Help:Citation Style 1 § Authors.
I wanted to add an example of the use of ref=none. This is the usual form, so I felt an example of its use was appropriate. Unfortunately, there is no talk page for discussion of this, as it links to here. Hawkeye7 (discuss) 21:11, 9 November 2021 (UTC)
|ref=none
with the edit summary Example text. I reverted that because
live documentation is not for testing. Editor Hawkeye7 then added new
{{
markup2}}
template with the edit summary NOT a test edit - ref=none needs to be used all the time - documentation needs to show example of this. I reverted again noting that the initial edit was labeled as a 'test' and suggested discussion here.
|ref=none
is noted in the template's
documentation which links to a longer discussion at {{
citation}}
. I'm not sure that yet-another-example at {{
cite encyclopedia}}
is all that beneficial.
Template:Cite book/doc states:
Afterword
, Foreword
, Introduction
, or Preface
will display unquoted; any other value will display in quotation marks. The author of the contribution is given in contributor.
However, I cannot manage to get any of Afterword
, Foreword
, Introduction
, or Preface
to work. Could someone help?
Veverve (
talk) 13:12, 13 November 2021 (UTC)
{{cite book |contributor=Foreword author |contribution=Foreword |author=Book author |title=Title of the book}}
Can we create an equivalent of Template:Page numbers for video timestamps? I would like to be able to cite minute #, second # of a video, as a reference for a statement. I believe this feature and a "Video timestamps needed" tag would improve the WP:V of videos used as references, of which there are many. LondonIP ( talk) 23:44, 13 November 2021 (UTC)
{{cite AV media|url=https://www.youtube.com/watch?v=ctoF5Ctc0ZM|title=Day at Night: Muhammad Ali, legendary boxing champion|time=21:50}}
The Egyptologist Dorothea Arnold doesn't have an article here, but does in three other Wikipedias. Unsurprisingly,
doesn't work. Is there a workaround (or am I missing something obvious)? -- Hoary ( talk) 01:23, 14 November 2021 (UTC)
{{Cite book |last=Arnold |first=Dorothea |author-link=:d:Special:EntityPage/Q1246153#sitelinks-wikipedia |last2=Allen |first2=James P. |last3=Green |first3=L. |url=https://books.google.com/books?id=sGLFwVkljQMC&pg=PA135 |title=The Royal Women of Amarna: Images of Beauty from Ancient Egypt |date=1996 |publisher=Metropolitan Museum of Art |isbn=978-0-87099-816-4 |language=en}}
Template:Cite web/Danish is listed as auto subst but has 165 transclusions. Any idea why it isn't being subst? Gonnym ( talk) 20:10, 15 November 2021 (UTC)
{{
Kilde www}}
a redirect to {{
cite web/Danish}}
? Does
User:AnomieBOT/source/tasks/TemplateSubster.pm understand redirects? I have no experience with Perl so it isn't clear to me from that code. Perhaps Editor
Anomie can provide an answer.<ref>
s (but it won't subst inside <nowiki>
, which makes {{
row numbers}} annoying). You're right that having over 100 transclusions will prevent substing, as a safety measure against some vandal finding a template with thousands of transclusions and having the bot subst them all. 100 seems a decent cutoff for someone to manually fix should a bad substing run need to be reverted.
Anomie
⚔ 22:02, 15 November 2021 (UTC)
|date=
parameters, if anyone is interested in troubleshooting. The code is too nested and strange for me to parse and does not catch invalid dates. –
Jonesey95 (
talk) 23:30, 15 November 2021 (UTC)
I was following this discussion with interest, but I see now it's been archived with nothing being done? Given the unanimous consensus there to make the proposed changes to the "volume" output, and since no-one objected to Kanguole's sandbox edits, is there any reason this can't be implemented right away? Dan from A.P. ( talk) 10:04, 21 November 2021 (UTC)
I have added 'author', 'collaborator', 'contributor', 'editor', 'translator' as bogus names. Yeah, this will break the convenient |author=Author
|editor=Editor
etc demo uses but these exist in article-space templates where they do not belong. We will just have to be a little more creative in our demos.
— Trappist the monk ( talk) 16:22, 22 November 2021 (UTC)
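As a rough illustration of what a bogus-name test does, here is a Python sketch. It is not the module's Lua; the function name is hypothetical, and matching the whole value case-insensitively is an assumption. The five names are the ones listed in the post above.

```python
# Placeholder names listed in the post above.
BOGUS_NAMES = {"author", "collaborator", "contributor", "editor", "translator"}


def is_bogus_name(value: str) -> bool:
    """Assumption: only a value that is exactly a placeholder word is bogus."""
    return value.strip().casefold() in BOGUS_NAMES
```

Under this assumption `|author=Author` is flagged, while a real name that merely contains one of the words is not.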
|author=XYZ, Inc.
replaced by |last=XYZ
|first=Inc
. Then there are authors who are individual people, but whose names do not follow the Western "forename surname" convention, so I might use |author=Ban Ki-moon
for which both |last=Ban
|first=Ki-moon
and |first=Ban
|last=Ki-moon
are just plain wrong. This has been discussed before. --
Redrose64 🌹 (
talk) 22:40, 22 November 2021 (UTC)
{{cite book/new |author=XYZ, Inc. |title=Title}}
{{cite book/new |author=Ban Ki-moon |title=Title}}
I am not sure if here is the correct location for my question so please forgive my trespass. I read myself around in circles a while to see if this had been discussed before and the only thing I found that was close is THIS which was about display of titles in italics vs quotes. It is why I felt this best posted here since it is of a similar type subject.
These are very different documents in scope and purpose and it feels erroneous that the difference isn't displayed accurately in the citations. Has there already been a discussion/consensus on this issue that I missed, or is it possibly a simple oversight and easy fix?
With best regards.
---
Darryl.P.Pike (
talk) 20:40, 23 November 2021 (UTC)
|type=
to display something other than the default "Thesis". –
Jonesey95 (
talk) 21:01, 23 November 2021 (UTC)
{{
Cite thesis/doc}}, in the Template Data section, shows an error, "|degree=
is not a valid parameter". I think that error message, which appears to be caused by
Module:Cs1 documentation support, is incorrect. |degree=
appears to be a documented and working alias of |type=
. It looks like something needs to be tweaked. –
Jonesey95 (
talk) 21:06, 23 November 2021 (UTC)
|degree=
is defined as a parameter that is unique to {{
cite thesis}}
|degree=
does not have any aliases|type=
as an alias of |degree=
; it is not, |degree=
modifies the template-specific default TitleType
metaparameter unless overridden by |type=
|type=
does not have |degree=
as an alias
I find the placement of the publisher in journal article citations distracting and a bit confusing. For example:
I think the journal volume and issue should come immediately after the journal name, instead of having the publisher inserted between them. That is, the above example should read:
This would make it more logical and smooth to read. Can someone please explain why we put the publisher in the place like we're doing? Any specific citation rule that I'm not aware of? Thanks a lot. Sorry for my bad English. 2604:3D08:4E7F:F7E0:952C:82DC:1A4F:9CD1 ( talk) 16:51, 28 November 2021 (UTC)
|publisher=
in {{
cite journal}}
. Yeah, the documentation sucks. But at
Help:Citation Style 1 § Work and publisher under the Publisher bullet is this:
Wikipedia:Village pump (proposals)#rfc: shall we update cs1/2?
— Trappist the monk ( talk) 23:13, 28 November 2021 (UTC)
Does someone know how to edit the TemplateData for this? It should be something like the following. Thanks.
"suggestedvalues": [ "live", "dead", "unfit" ]
— Michael Z. 01:07, 29 November 2021 (UTC)
{{
cite book}}
appears to have something like what you are suggesting; see
Template:Cite book/TemplateData{{
cite web}}
in the visual editor, and the field has a dropdown, but it’s not populated with values (if I enter a valid one, then only it appears). Clicking on the template's documentation brought me here. —
Michael
Z. 01:38, 29 November 2021 (UTC)
{{
cite web}}
and {{
cite book}}
appear to have more-or-less the same "suggestedvalues"
values (order is different); I doubt that order makes a difference. There are other obvious differences where one template has something that the other template does not: {{cite book}}
has "default"
; {{cite web}}
has "autovalue"
and "suggested"
. I suppose that these might make a difference but I don't know. Have you tried an experiment using the {{cite book}}
template instead of {{cite web}}
?
Is there a reason why double quotation marks in such cases are not automatically displayed as single quotation marks? E.g.
{{cite web |title=Title with "quotation marks" in it |url=http://www.example.com/}}
Cheers – Finnusertop ( talk ⋅ contribs) 08:29, 29 November 2021 (UTC)
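The conversion being asked about can be sketched naively in Python. This illustrates the question only, not how cs1|2 renders titles; the function name is hypothetical, and the sketch ignores curly quotes and titles that already contain apostrophes or single quotes.

```python
def quote_title(title: str) -> str:
    """Wrap a title in double quotes, turning inner double quotes into single ones."""
    inner = title.replace('"', "'")
    return '"' + inner + '"'
```

A title that mixes apostrophes with quoted phrases would come out ambiguous under this naive replacement.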
|title=
includes at least one double quote mark so the search includes templates like {{
cite book}}
that don't wrap |title=
in quote marks and ignores |chapter=
(and aliases) parameters that do.
There's always been a weird behaviour when |collaboration=
is set. E.g.
{{Cite book |last1 = Van Dijk |first1 = Peter Paul |last2 = Iverson |first2 = John |last3 = Shaffer |first3 = H. Bradley |last4 = Bour |first4 = Roger |last5 = Rhodin |first5 = Anders |collaboration=Turtle Taxonomy Working Group |year = 2012 |chapter = Turtles of the World, 2012 Update: Annotated Checklist of Taxonomy, Synonymy, Distribution, and Conservation Status |title = Conservation Biology of Freshwater Turtles and Tortoises |doi = 10.3854/crm.5.000.checklist.v5.2012 |isbn = 978-0965354097}}
gives
but it should instead give
Et al should not be applied automatically when collaborations are set. Headbomb { t · c · p · b} 06:09, 4 December 2021 (UTC)
The majority of cases would have the et al. though. Headbomb {talk / contribs / physics / books} 02:16, 26 December 2015 (UTC)), and we're now some 6 years down the road. Has your assessment given then changed for some reason? Do you want to track down and add it to all the citations that rely on the current behavior? Izno ( talk) 07:38, 4 December 2021 (UTC)
|collaboration=
which rely on an automatic et al?
Izno (
talk) 08:14, 4 December 2021 (UTC)
|collaboration=
which inappropriately adds et al? The vast majority of uses requiring an et al. to be displayed already have a manually set display-authors. The vast majority of uses which don't have a display authors shouldn't have the et al. There are very, very few citations with a collaboration parameter set that need an automatic et al to be added.
Headbomb {
t ·
c ·
p ·
b} 15:57, 4 December 2021 (UTC)
|display-authors=4
works properly, but |display-authors=5
returns an error. This should be handled at the source. I suppose when/if development on the module collection resumes, this could be tasked, following discussion.
65.88.88.71 (
talk) 16:47, 4 December 2021 (UTC)
|display-authors=
cannot be set manually for authors>4, or remove default-value rendering.
65.88.88.71 (
talk) 16:59, 4 December 2021 (UTC)
four authors default?
|collaboration=
does not count author names. The
documentation does say that 'et al.' will be appended to the author-name list when |collaboration=
is used. If you believe that the documentation can be improved, please do so.
{{Cite book |last1 = Van Dijk |first1 = Peter Paul |last2 = Iverson |first2 = John |last3 = Shaffer |first3 = H. Bradley |last4 = Bour |first4 = Roger |last5 = Rhodin |first5 = Anders |collaboration=Turtle Taxonomy Working Group |year = 2012 |chapter = Turtles of the World, 2012 Update: Annotated Checklist of Taxonomy, Synonymy, Distribution, and Conservation Status |title = Conservation Biology of Freshwater Turtles and Tortoises |doi = 10.3854/crm.5.000.checklist.v5.2012 |isbn = 978-0965354097|display-authors=4}}
{{Cite book |last1 = Van Dijk |first1 = Peter Paul |last2 = Iverson |first2 = John |last3 = Shaffer |first3 = H. Bradley |last4 = Bour |first4 = Roger |last5 = Rhodin |first5 = Anders |collaboration=Turtle Taxonomy Working Group |year = 2012 |chapter = Turtles of the World, 2012 Update: Annotated Checklist of Taxonomy, Synonymy, Distribution, and Conservation Status |title = Conservation Biology of Freshwater Turtles and Tortoises |doi = 10.3854/crm.5.000.checklist.v5.2012 |isbn = 978-0965354097|display-authors=5}}
{{
cite book}}
: Invalid |display-authors=5
(
help)|display-authors=
and |collaboration=
.
65.88.88.69 (
talk) 20:28, 4 December 2021 (UTC)
|display-authors=etal
is a de facto default when |collaboration=some value
.
65.88.88.69 (
talk) 21:37, 4 December 2021 (UTC)
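The behaviour reported in this thread can be summarised in a short Python model. It is a sketch of what the posts and examples above describe, not the module's code; the function name and return shape are hypothetical, and it assumes that the "et al." is still shown when the count is rejected.

```python
def displayed_authors(names, display_authors=None, collaboration=None):
    """Return (names shown, whether 'et al.' is appended, error message or None)."""
    error = None
    shown = list(names)
    # As reported above: setting |collaboration= appends 'et al.' by default.
    etal = collaboration is not None
    if display_authors == "etal":
        etal = True
    elif display_authors is not None:
        if display_authors >= len(names):
            # As in the five-author example above, where 5 is rejected.
            error = f"Invalid |display-authors={display_authors}"
        else:
            shown = list(names[:display_authors])
            etal = True
    return shown, etal, error
```

In this model there is no value of `display_authors` that shows all five names without the "et al." once a collaboration is set, which is the complaint raised in this section.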
{{
citation/core}}
, there was provision for no more than 9 authors, but the ninth (if supplied) was never displayed. By default, the first 8 were always displayed, and if you supplied either |last9=
or |author9=
, the first 8 would be followed by "et al". This cut-off could be adjusted by means of the |display-authors=
parameter, which accepted an integer in the range 1-8, so you could show fewer than 8 (but not more) before the "et al". Regardless of the number actually displayed, all 9 would be put into the COinS. All this changed in 2013 when
Module:Citation/CS1 was introduced. --
Redrose64 🌹 (
talk) 23:27, 4 December 2021 (UTC)
|collaboration=
in that citation? My gut reaction is that the collaboration's name should be included and the list of individual names truncated to one or a few primary names because I suspect (without any evidence to support this) that it will be easier for a reader to locate the source by primary authors and the name of the collaboration than by the names of all n author names (when n is a relatively large number).
|collaboration=
without any |author=
, |last=
, or |vauthors=
is silently ignored:
{{cite book |collaboration=The Writers Group |title=Title}}
→ Title.
|collaboration=
requires at least one author name so templates without that name should declare a missing-name error.
|display-authors=etal
is the default. It should not be so, per
POLA.
Headbomb {
t ·
c ·
p ·
b} 19:04, 4 December 2021 (UTC)
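For comparison, the pre-2013 {{citation/core}} rules described earlier in this thread (at most 9 authors accepted, 8 displayed by default, a cutoff adjustable from 1 to 8) can be modelled as below. This is a Python sketch of that description only; the function name and return shape are hypothetical.

```python
def legacy_author_list(authors, display_authors=8):
    """Model of the old rules: return (names shown, whether 'et al' follows)."""
    if not 1 <= display_authors <= 8:
        raise ValueError("display-authors must be an integer from 1 to 8")
    accepted = list(authors[:9])  # a tenth or later author was not provided for
    shown = accepted[:display_authors]
    etal = len(accepted) > display_authors
    return shown, etal
```

Supplying a ninth name therefore never displays it; it only triggers the "et al" after the first eight.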
A commitment to fixing uses after a change can help ensure the change gets made in the first place.
|display-authors=etal
is the default. It should not be so.
65.88.88.71 (
talk) 15:18, 4 December 2021 (UTC)
"Neither of us have any way of detecting problem cases." is true, but that also means we have no data to support any suggested implementation. I do agree that the current default, "et al", probably reflects more citations to collaborations than the "we named all named authors so we don't need the et al" case.
|display-authors=
to be set to some reasonable value/key to indicate that no et al should be displayed from the current default.|authorn=et al.
as the last author in an arbitrarily shortened author list? Or is ""et al." a bogus name? Such citation may be convenient but it is not correct. The correct form is to give all authors as they appear on the source (subject to module limits) and then optionally truncate the list by using |display-authors=
. This way both verifiability and attribution ate satisfied. Any other concern (additional text, more cumbersome editing, etc) comes in a distant last.
65.204.10.232 (
talk) 15:11, 5 December 2021 (UTC)
|collaboration=
and all authors in the collaboration are also provided in the citation of interest.
"anything stopping an editor from eschewing display-authors and adding" Yes, an error displays: Last1; et al. Title.
{{
cite book}}
: Explicit use of et al. in: |last2=
(
help)CS1 maint: numeric names: authors list (
link), subsequently cleaned up by gnomes.
Note, for example,
Template:Cite Blomefield, which throws a "Check date values in: |publication-date=" error, but the c. works fine for the |year=
parameter. Could the cite book template be modified so "c." is valid for |publication-date=
? =
paul2520
💬 19:13, 5 December 2021 (UTC)
{{
Cite Blomefield}}
consider removing the 'when written' date (currently in |year=
) and replacing |publication-date=
with |date=
. The 'when written' date doesn't really aid a reader in locating a copy of the source and may, in fact, cause some confusion because the 'when written' date appears first in the rendered citation. There has been some discussion here about changing
Module:Citation/CS1 so that |publication-date=
becomes a complete alias of |date=
.