This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Archive 1 | Archive 2 | Archive 3 |
Short descriptions are sufficiently unfamiliar, that I think it would be good to automatically have a thread on the talk page when one has been added (at least to start with).
Where a short description template has been added to an article, it should be easy enough to get a bot to subst a template on the talk page, explaining what short templates are, saying how they can be seen, and presenting the text that has been added, making clear that it can be modified. Jheald ( talk) 16:53, 16 February 2018 (UTC)
It would be useful to have a rolling update of how many short templates have been added, perhaps broken out by Wikiproject, and/or by some subject classification based on Wikidata.
It would also be nice to be able to view what descriptions have been added for particular sorts of things, with perhaps the option to sort to show the most recent for a particular facet, and/or show recent diffs.
This perhaps needs a tool to watch the recent changes stream, and keep an off-wiki database of current descriptions, and diff ids for recent changes to them, to allow descriptions for groups of pages to be easily extracted and browsed. Jheald ( talk) 17:05, 16 February 2018 (UTC)
I applaud this idea, but doubt I will have the time to contribute. A suggestion: work from DAB (including geodis and hndis) and name pages. They vary in quality, but a well-written one-line description on one of those could give you everything that's needed, and so get the numbers up very quickly. Narky Blert ( talk) 15:40, 20 February 2018 (UTC)
Someone who is a template editor can have a go at embedding the short description template in the disambiguation template and see if it works. · · · Peter (Southwood) (talk): 19:39, 29 March 2018 (UTC)
It bothers me that people are now starting to add short descriptions, but there still seems to be no very good guidance as to what they should look like in particular standard cases, ie what we are ideally aiming for. It seems incredibly hard to get discussion going about this -- what short descriptions are considered good or not so good? Why? In the absence of any central steer, different editors are going to have very different ideas as to what to write, how much to include, what to leave out.
For example:
Which of these is better? Why? Jheald ( talk) 10:22, 22 February 2018 (UTC)
"Optimisation ... is less urgent than getting usable descriptions on pages." Well, my hope for this process is that it would be a chance to level-up, a chance to define and quality-control and improve the standard of these short descriptions. I don't see much value in an all-out effort to add short descriptions here, if they're not going to be systematically better or more suitable than what's being served already. Jheald ( talk) 11:25, 23 February 2018 (UTC)
Hi, an interesting project idea. I'd be willing to help if I'm of use but would like some clarification first. I would look primarily at articles dealing with football which should be a pretty standard description between each page. Taking the first player from Category:English footballers for example, Arthur Aaron (footballer), would adding simply English footballer be an appropriate addition or would it need to be more descriptive? Perhaps English footballer who played for Stockport County? Also, would it actually be useful for me to work through potentially thousands of football articles or could a bot simply add English footballer to every player in that category? Kosack ( talk) 10:57, 22 February 2018 (UTC)
tinyurl.com/ya8rjg7w
Wiki NYC is organizing a wiki translation event in April 2018.
As we are looking around for suggested content to translate I thought that this project could curate useful content to recommend for translation.
There would be lots of details to work out both in this WikiProject and in event outreach to make it work, and it probably would not be possible to ready anything for this upcoming event in the next few weeks, but I thought that I would post here to suggest that if this project did get better established then in-person events for new wiki editors could be a way to amplify this project's outcomes and make the content more accessible in more languages. Blue Rasberry (talk) 14:38, 28 February 2018 (UTC)
I've followed the advice overleaf to display short descriptions in desktop view (Monobook here). I'm surprised to find that the display is in lowercase, except for the 1st letter. At SS Zealandia (1910) I see, "Australian cargo and passenger steamship sunk in the bombing of darwin". How can I get a display that's identical to what's written in the article's template? -- Michael Bednarek ( talk) 10:54, 9 April 2018 (UTC)
{{short description| '''short <p> <big>description</big>''' }}
→ {{short description/sandbox| '''short <p> <big>description</big>''' }}
I added 9 short descriptions: one for each of the state capitals in New England, and the largest cities in New Hampshire, Maine, and Vermont (the capital and largest city are the same city in Massachusetts and Rhode Island, and the largest city in Connecticut may not stay the largest).
Am I doing it right? Should I continue? HotdogPi 21:43, 28 April 2018 (UTC)
Template talk:Disambiguation#Edit request for inclusion of short description template
Discussion there is getting wider. I created {{ Disambiguation page short description}}, which holds the string that is used on {{ Disambiguation}} and others (hopefully!) for easy modification if necessary. (Please someone protect this; it affects/will affect hundreds of thousands of articles.) That new string template is only added to a couple of disambig templates at the moment; I await some sense of agreement with the principle before changing {{ Disambiguation}} and others to use it.
Outstanding topics include what value is gained from treating "sets" differently, such as the reversion of "type=disambiguation page" on {{ surname}}, which put all of those broadly-dab pages back into Category:Articles with short description. Outriggr ( talk) 04:22, 2 May 2018 (UTC)
<includeonly>{{short description|{{Disambiguation page short description}}|pagetype = Disambiguation page}}</includeonly>
Outriggr ( talk) 04:47, 2 May 2018 (UTC)
I noticed that the template still displays as a hidden block on the rendered page. This isn't really good for long term strategy, with regard to content reuse/alternative engines etc. The only reason to do it is to make it visible for those who don't want to use the gadget to make it visible. We should really deprecate that practice in my opinion, especially since using this template seems to have become the standard. — TheDJ ( talk • contribs) 09:39, 11 May 2018 (UTC)
{{short description|Xyz}} to {{SHORTDESC:Xyz}}, which would be a trivial job, as I'm sure you'll agree. However, there may be sufficient advantages to wrapping in a template (ability to add/remove tracking categories, etc.) that it might be better to retain the template and just cut out the hidden text at some point in the future. We could do that at any time, of course, but I'd want to be sure that we wouldn't hinder the process of adding short descriptions to articles. Any thoughts on how we might gauge that? -- RexxS ( talk) 12:12, 12 May 2018 (UTC)
For a project of this scale – generating short descriptions for 5.5 million articles – there does seem a need for more widespread awareness (I've added a plug in Community portal), and more communication of what is going on. For example: the project page does include "we are working on making infoboxes generate descriptions automatically", but no more detail than that (until I just added to it). I've been following this recently, but never spotted any announcement that half-a-million placename articles had already had descriptions generated out of {{Infobox settlement}}. Well done! But AFAIK no-one said. It just happened one day. Would be good to know if plans are afoot to do similar with other common types of infobox, then we wouldn't waste our time typing in descriptions for articles carrying those types of infobox. A lot of articles, though, have no infobox and there are editors keen to keep it that way. How about the proposal to generate descriptions out of leads – how's that going? : Noyster (talk), 08:35, 12 May 2018 (UTC)
This discussion tends to bear me out about awareness. Until the project wins wider acceptance, it may be well for those of us adding SDs by manual editing to avoid annoying people via their watchlists, by binding ourselves to the rule imposed on bot operators: do it only as part of an edit that also alters the appearance of the page as rendered (in this case, meaning "as rendered on desktop"). : Noyster (talk), 09:51, 27 May 2018 (UTC)
Hi all, there's currently a problem with the short description on San Francisco that breaks the page on the iOS app. The short description that's being generated includes a line break, which is causing problems. The short description seen in iOS app search is: "City and County in California in California", and when you open the page, it just shows the lead image and nothing else. Here's a screenshot of what it looks like.
WMF developers are fixing the issue on our end -- our display should be resilient enough to handle accidental line breaks in the description. But as we've been investigating the problem, it's been hard for us to understand what Template:Infobox settlement is doing with the short descriptions.
As far as I can tell, Infobox settlement is pulling information to dynamically create the short description. But I would imagine that the descriptions would use the same format, and they don't. Some examples of the short descriptions I'm currently seeing in the app:
In the Json blob, the San Francisco article says:
I'm also seeing a mistake in the metropolitan areas:
I assume this is pulling information from the infobox itself, so is the infobox for Chicago metropolitan area missing a property that would fill that empty space? It would be great if someone could explain how these descriptions are constructed, so that our devs can help when there's a bug, like the one on the San Francisco page.
Also, when there's an obvious fail like "Los Angeles: Place", what can an editor do to add a better description? I know that Pbsouthwood was concerned a while back about editors not being able to see the short description in the wikitext, so they can edit or update them. Is that still a concern? I looked for anything related to the short description on these pages, and there's no indication of what the description is, or how to fix it. -- DannyH (WMF) ( talk) 22:46, 25 May 2018 (UTC)
short_description = parameter. However, to meet cases of need like those above I think there should be more obvious means of editing these autogenerated SDs, which are being created from increasing numbers of other infoboxes as well. The autogenerated SDs cannot be overridden using the helper script, even though this provides an "Edit" button (discussed here). Nor can they be overridden using the {{Short description}} template unless the template is placed below the infobox, rather than above as recommended (discussed here). I agree that it would be good to have the SD visible in wikitext, and also in preview mode. : Noyster (talk), 07:56, 26 May 2018 (UTC)
Do we have a constraint which would notice if there are two {{ short description}} templates, add some red text on the page and possibly add it to a service category?-- Ymblanter ( talk) 08:22, 16 November 2018 (UTC)
Hello everyone. I have developed a way to procedurally generate short descriptions for the 387,816 articles in Category:Articles with 'species' microformats. My method, while not perfect, generally produces a better short description than the Wikidata description. It operates by copying a snippet from the first sentence of the lede that is suitable as the short description. This method relies on the relatively systematic way articles in this category are written - use caution before expanding this beyond this category.
The purpose of this is to explain exactly how a program would generate these summaries:
Download the wikitext for an article in Category:Articles with 'species' microformats
Run the regex string (?<=(. a | an ))(.*?)(?=(\.|,| which | known | found | describe|<ref|\(| native | grow| that | within | from | cause)) on the wikitext
Take the first match generated by this regex, ignore/discard the rest.
This produces the basic short description; now we need to clean it up.
Start loop
Run the regex \[\[[^\]]*\| to identify the left side of piped links.
If any matches were found, remove the matched text and repeat the loop
End loop
Run the above loop three more times, replacing the regex line with each of the lines below in turn, to strip out links, bold, and italics
Run the regex \[
Run the regex \]
Run the regex [']{2,}
If the string is "Gram-negative" replace it with "Gram-negative bacteria"
Check whether there's a space in the string
if there is not
Add the article to a list for carbon-based intelligence to deal with, then skip the article
Check the length of the string
if length in characters > 70
Add the article to a list for carbon-based intelligence to deal with, then skip the article
else, add {{shortdescription|(the remaining regex match)}} to the article. Include attribution in the edit summary.
The regex looks for a string that is immediately preceded by "Any character, space, lowercase a, space" or "Space, lowercase a, lowercase n, space". The "any character" is there because the lookbehind alternatives must be of the same length (four characters in this case). It then matches any number of characters (the short description) until the string immediately in front of it is one of several stop codes.
All links and bolding/italics are then stripped out of the short description. If the short description is longer than 70 characters, it is left for a human. If the short description contains no spaces, it's left for a human. Otherwise it is posted at the top of the article in the shortdescription template.
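The first-pass procedure described above can be sketched in Python roughly as follows. This is only an illustrative reading of the steps, not the code that was actually run: the function name `generate_description` and the sample wikitext are made up for the example, and the "repeat until no matches" loops are replaced by a global `re.sub`, which should have the same effect. Fetching the wikitext and posting the edit are left out.

```python
import re

# First-pass pattern as quoted in the post above.
# Both lookbehind alternatives are four characters wide, which Python's
# re module requires (lookbehinds must be fixed-width).
MAIN_RE = re.compile(
    r"(?<=(. a | an ))(.*?)"
    r"(?=(\.|,| which | known | found | describe|<ref|\(| native | grow"
    r"| that | within | from | cause))"
)

# Cleanup patterns, in the order given in the post:
# left side of piped links, opening brackets, closing brackets, bold/italic quotes.
CLEANUP_PATTERNS = (r"\[\[[^\]]*\|", r"\[", r"\]", r"[']{2,}")

MAX_LEN = 70  # length limit used in the first pass


def generate_description(wikitext):
    """Return a candidate short description, or None to leave the page for a human."""
    match = MAIN_RE.search(wikitext)  # only the first match is used
    if match is None:
        return None
    text = match.group(2)  # group 1 is the lookbehind, group 3 the stop code
    for pattern in CLEANUP_PATTERNS:
        text = re.sub(pattern, "", text)
    if text == "Gram-negative":
        text = "Gram-negative bacteria"
    if " " not in text:      # single-word results are skipped
        return None
    if len(text) > MAX_LEN:  # overlong results are skipped
        return None
    return text


# Invented sample wikitext, for illustration only.
sample = "'''Profundiconus pacificus''' is a [[species]] of [[sea snail]], a marine [[gastropod]]."
print(generate_description(sample))
```

A result of `None` corresponds to "add the article to a list for carbon-based intelligence to deal with"; anything else would be wrapped in the short description template by the bot.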
I used Random page in category Articles with 'species' microformats to generate a sample of articles. The article, my procedurally generated short description, and the wikidata description are included.
Article | Procedurally generated description | Wikidata description | Notes | Comments |
---|---|---|---|---|
Profundiconus pacificus | species of sea snail | species of mollusc | A good example of the improvements achievable over a wikidata import. | |
Catocala caesia | moth of the Erebidae family | species of insect | Significant improvement on WD (P) | |
Pterostylis daintreana | species of orchid endemic to eastern Australia | species of plant | endemic should probably be added to the stop codes | I think endemic is acceptable, but you could shorten a bit by substituting "from" for "endemic to" (P) |
Sewa taiwana | moth of the Drepanidae family | species of insect | ||
Lactobacillus pontis | (skipped, added to human list) | species of prokaryote | Algorithm produced "rod-shaped", which gets kicked for a lack of spaces. Bacteria articles are hard on my algorithm. Is there a subcategory I can skip? | |
Ross seal | true seal | species of mammal | Not a big improvement, but not unacceptable. (P) | |
Turner's thick-toed gecko | species of gecko | species of reptile | Not a big improvement as the title already contains "gecko", like previous example, but also not unacceptable (P) | |
Coleophora sylvaticella | moth of the Coleophoridae family | species of insect | ||
Solirubrobacter pauli | mesophilic Gram-positive and aerobic bacterium | (none) | 46 characters, algorithm got lucky here. | Very compact, quite informative. If anything a bit technical, "and" could be left out, reducing length to 42 characters if worth the effort. |
Leucotabanus ambiguus | species of horse flies in the subfamily Tabaninae | species of insect | big improvement on WD, could be improved, but should be good enough. (P) | |
Chersodromus | genus of snakes of the family Colubridae | genus of reptiles | Big improvement on WD (P) | |
Artedius harringtoni | (skipped, added to a list for humans to parse) | species of fish | Algorithm returned "demersal" which is rejected for lack of spaces. | |
Mitrella blanda | species of sea snail | species of mollusc | ||
Givira aregentipuncta | moth in the Cossidae family | species of insect | ||
Medicorophium | genus of amphipod crustaceans | genus of crustaceans | Improvement on WD. Could be improved, but probably good enough (P) | |
Scrophularia ningpoensis | perennial plant of the family Scrophulariaceae | species of plant | significant improvement on WD (P) | |
Anadasmus sororia | moth of the Depressariidae family | species of insect | ||
Hakea flabellifolia | shrub of the genus Hakea | species of plant | ||
Shrew | small mole-like mammal classified in the order Eulipotyphla | family of mammals | 58 characters | "Classified in" can be reduced to "in" takes it down to 48. This may be a generally applicable modification (P) |
Gascoyne's Scarlet | English cultivar of domesticated apple | apple | nice, probably close to optimum. (P) | |
Barred thicklip | species of fish belonging to the wrasse Family | species of fish | Need to switch order in which I strip links and check regex. | "belonging to" could be reduced to "in" (P) |
Moluccan scops owl | owl found in Indonesia | species of owl |
The things we need to decide:
To move this forward, we need to do a couple of things:
I'm inviting comments on this now - if it looks good, we can get a consensus for it and I'll start refining it.
Cheers, Tazerdadog ( talk) 05:38, 3 June 2018 (UTC)
{{short description}} templates added by a human at the top of the article are not overridden by this process – as is already happening with descriptions generated out of infoboxes : Noyster (talk), 11:40, 3 June 2018 (UTC)
{{short description}} at the top of the article (as a side-note, for the infoboxes a fix is in the pipeline, assuming something is happening with phab:T193857) Galobtter ( pingó mió) 11:55, 3 June 2018 (UTC)
Ok, I've had some time to mull over @ Galobtter:'s list, and I've arrived at the following basic conclusions.
1) Start by pulling the first three sentences, not just the first sentence. We got quite a few hat notes and abbreviations instead of sentences. I believe this can be done by changing a number, if it's a real problem let me know.
2) Remove parentheses and everything inside them before running my regex. Parentheses were mucking things up, and this is cleaner than my kludge of stopping at an open parenthesis. If this is a problem to implement, skip it.
3) Run my updated regex main expression (below). Notable features include adding "is" to ensure more robust starts, two new start codes ("is the", and "is" followed by "one"), and 5 new end codes.
(?<=(.. is a |. is an | is the |.... is (?=one )))(.*?)(?=(\.|,| which | known | found | describe|<ref| whose | native | grow| that | within | from | cause| used | and | with |;))
4) Note that single-word short descriptions are filtered out - if the regex returns something without spaces, it defaults to a skip. We don't need to change the way we've been testing, but everyone should bear this in mind when reading the results.
5) Similarly to the previous point, there's also a filter for maximum length. I currently have it set to skip if my regex produces something longer than 70 characters, but commentary on the appropriate length is very welcome. If you want to pitch in but don't want to program, looking at the test data and telling us where to set that character count would be helpful.
Could you please generate a new sample for us Galobtter?
Cheers. Tazerdadog ( talk) 12:13, 3 June 2018 (UTC)
(?<=(.. is a |. is an |. are a | are an | is the |.... is (?=one )))(.*?)(?=(\.|,| which | known | first described | describe|<ref| whose | native | grow| that | in which | from | cause| used | with |;))
Sorry for not doing anything on this for the last few days. I have just manually evaluated the first 100 short descriptions generated at User:Tazerdadog/organism descriptions first hundred. The results were: 18 skips, 4 bad descriptions, 5 OK descriptions, and 73 good/great descriptions.
Definitions:
Good/great: Better than wikidata's generic "Species of plant" style constructions.
OK: Worse than wikidata, but better than nothing. May contain minorly misleading statements, or minor grammar errors
Bad: Worse than nothing - either very misleading, nonsensical, or containing a glaring grammar error.
Skip: The algorithm hit something it wasn't comfortable with and declined to return a short description. Such pages will either receive the wikidata short description or be reserved for a human.
9 of the 18 skips were for being too long. In my evaluation, most of the descriptions rejected for length didn't have any problems except that they were too long. I'm therefore going to bump up the filter from max 70 chars to max 90 chars. This is an area where non-technically inclined people can help me - sort by character count, and give me your opinion on where I should draw the line for "that's so long that it's worse than no short description at all".
I'm going to add a filter for "Contains exactly one space, and was terminated by a comma". These two-word constructions are generally part of a list of traits that I can't parse easily - it's safer to just skip them. If this filter had been in place on this pass, two bad descriptions and one good one would have been skipped.
I'm going to add a filter for short descriptions ending in "ing". Such endings are more likely to be grammatically incorrect. In this run, this filter would have skipped one bad description.
I'm going to add in the two new filters and modify the existing one, and plow through evaluating another 100 short descriptions.
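The revised filters described above might look something like this in Python. It is a sketch under assumptions: the function name `passes_filters` is invented, and it assumes the caller knows which stop code ended the regex match (passed in as `terminator`), which the post implies but does not spell out.

```python
MAX_LEN = 90  # raised from 70 as proposed above


def passes_filters(description, terminator):
    """Return True if a candidate description should be posted, False if it should be skipped.

    `terminator` is the stop code that ended the regex match (hypothetical
    parameter; the original implementation is not shown in the discussion).
    """
    if len(description) > MAX_LEN:
        return False  # too long
    if " " not in description:
        return False  # existing filter: single-word results
    if description.count(" ") == 1 and terminator == ",":
        return False  # new filter: two-word result cut off by a comma
    if description.endswith("ing"):
        return False  # new filter: likely ungrammatical ending
    return True
```

A skipped candidate would, as before, either fall back to the wikidata description or be left for a human.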
At this point, I'd like to get a conversation going about the acceptable error rate for fully automatic posting. I think I'm close to that point, but we need a strong consensus that we're there before we go into a BRFA.
My opinion: On the articles that the bot doesn't skip, 80% must be better than wikidata (currently 73/82, 89.0%), and no more than 3 percent can have problems so severe that they're worse than nothing (currently 4/82, 4.9%). Either a 95% confidence interval or a sample size of 250 is sufficient. (95% CI is pretty much the scientific standard; 250 sample size is the most I'd consign anyone to slog through checking my classifications.) Tazerdadog ( talk) 11:23, 8 June 2018 (UTC)
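For anyone wanting to check the figures, here is a small Python sketch that recomputes the two rates from the counts above and attaches a 95% interval to each. The Wilson score interval is an assumption on my part; the post does not say which kind of confidence interval is intended.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 for roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Counts from the first hundred: 18 skips leave 82 posted descriptions.
posted = 100 - 18
good, bad = 73, 4

good_rate = good / posted
bad_rate = bad / posted

print(f"good: {good_rate:.1%}, 95% interval {wilson_interval(good, posted)}")
print(f"bad:  {bad_rate:.1%}, 95% interval {wilson_interval(bad, posted)}")

# Proposed thresholds: at least 80% good, at most 3% bad.
print("good threshold met:", good_rate >= 0.80)
print("bad threshold met:", bad_rate <= 0.03)
```

On these point estimates the "good" threshold is met but the "bad" threshold is not yet, which is consistent with adding the extra filters before another run.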
So you guys may have already noticed, but I created a project banner {{ WikiProject Short descriptions}} and a project category Category:WikiProject Short descriptions to make it easier to keep track of relevant pages under this project's scope. I also tweaked most of the categories populated by {{ short description}} to standardize them and ensure they weren't orphaned. (BTW, if any of you can figure out a better image that encapsulates the concept of "short descriptions", feel free to tweak the banner...) Cheers. — AfroThundr ( u · t · c) 04:08, 3 August 2018 (UTC)
Can the short description of an article be accessed and used in a disambiguation page somehow? -- Gonnym ( talk) 13:42, 19 September 2018 (UTC)
{{short description|Natural number, composite number}} for 14 (number), which improves on the bare "Natural number" that the infobox would supply. Because the manual description replaces, not supplements, the infobox-generated one, it really is no sort of CONTENTFORK at all: see the page info for 14 (number) if you need to be convinced. It is never bad practice to use manual methods when an automatic method doesn't give the required result, and that applies to both the editing and coding perspectives. I'm sorry my suggested solution didn't please you, but it does solve your issue. -- RexxS ( talk) 17:10, 19 September 2018 (UTC)
How do I find articles that need a short description? For example, I came across this article in the Featured section - Aggie Bonfire - and there is no short description template on the source page. Does that mean I can add one on it? Likewise for the page Beer Festival, thanks in advance - Vinvibes ( talk)
Editors are beginning to use {{Annotated link}} on disambiguation pages, which is great, but the formatting does not match MOS:DAB guidelines. For example, in Ben Nevis (disambiguation), three of the See also items have been added using {{Annotated link}}, and have the following format:
According to MOS:DAB, the format should be:
Specifically, a comma instead of a hyphen after the article title and the following word in lower case (if not a proper name). Is it possible to adjust the template or use parameters to effect these changes? It seems like a small detail, but disambiguation pages are designed to be quickly scanned and that is easier if there is a consistent format. Thanks, Leschnei ( talk) 17:37, 27 December 2018 (UTC)
{{Annotated link/sandbox|Nevis Radio|style=y}} -> Nevis Radio – Community radio station in Fort William, Scotland
{{Annotated link/sandbox|Nevis Radio|style=y|word=the}} -> Nevis Radio – Community radio station in Fort William, Scotland
{{Annotated link/sandbox|Nevis Radio|style=y|word=yes}} -> Nevis Radio – Community radio station in Fort William, Scotland
Parameter names can be changed, just needed something fast. -- Gonnym ( talk) 18:34, 27 December 2018 (UTC)
Hello, I've nominated this category to be renamed to Category:Articles with short descriptions (plural). Would you please offer input at Wikipedia:Categories for discussion/Log/2018 December 31? Thank you. Nyttend ( talk) 02:22, 31 December 2018 (UTC)
We now have about 672,000 articles with a short description input or generated within Wikipedia, against a target of 2 million that WMF has set us before the link to Wikidata descriptions is cut.
Almost 60,000 of these articles have had their SD added manually by editors (including imports from Wikidata using the Shortdesc helper). This activity has picked up in pace over the past six months, and is now adding SDs at the rate of about 500 per day. However, relying on this alone, at the present pace it would take about another seven years to hit the 2 million target.
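The "about another seven years" estimate follows directly from the figures quoted above; a quick Python check, assuming the manual rate stays flat at 500 per day:

```python
target = 2_000_000   # WMF target quoted above
current = 672_000    # articles with a local short description so far
per_day = 500        # approximate manual rate

remaining = target - current
days = remaining / per_day
years = days / 365.25

print(f"{remaining:,} to go, about {days:,.0f} days, or {years:.1f} years")
```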
Meanwhile, over 90% of the SDs so far added have been automatically generated from infoboxes, chiefly {{Infobox settlement}}. But this side of the project appears to have stalled, with no updates since May to the lists above of infoboxes currently used as sources or being worked upon.
No-one appears to have yet tackled the 'biggie', which is deriving SDs from {{Infobox person}} and all its derivatives.
Petscan says there are about 891,000 BLPs. Just {{Infobox person}} is used on 321,000 articles, then {{Infobox football biography}} on 161,000, {{Infobox officeholder}} on 136,000, and so on. In view of these large numbers, and recalling that much of the original concern about relying on Wikidata descriptions was focussed on BLP issues, it seems to me that deriving SDs from these infoboxes is an essential step forward. The core of these SDs would presumably be Nationality occupation, or for more specific infoboxes such as {{Infobox architect}} it would just be Nationality architect.
This done, there would still remain plenty of scope for human input in checking, correcting and improving the auto-generated SDs for articles about people : Bhunacat10 (talk), 12:31, 6 January 2019 (UTC)
Nationality occupation is a pretty bland description, and will seldom be optimum, but it will be good enough for a first approximation, as it is unlikely that anyone will have a strong objection to them because they will comply with BLP policy if the infobox does.
Austro-Hungarian novelist.
Austro-Hungarian and Czechoslovakian novelist, short story writer and insurance officer.
19th century novelist.
19th century novelist, short story writer and insurance officer.
19th century Austro-Hungarian novelist.
19th century Austro-Hungarian and Czechoslovakian novelist.
19th century Austro-Hungarian and Czechoslovakian novelist, short story writer and insurance officer.
Could be others, but those seem to me the major ones. Agree that the first nationality and occupation is indeed better in this case. Question is, is this always the case in Infobox Person? -- Gonnym ( talk) 16:37, 6 January 2019 (UTC)
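To make the "first nationality and occupation" option concrete, here is a rough Python sketch. It is purely illustrative: it assumes the infobox parameters have already been parsed into a dict with plain-text `nationality` and `occupation` values, and the helper names are invented. The real thing would presumably be done in the template or a Lua module, not in Python.

```python
import re


def first_item(value):
    """Return the first entry of a comma-, semicolon- or 'and'-separated list."""
    for part in re.split(r",|;|\band\b", value):
        part = part.strip()
        if part:
            return part
    return ""


def person_description(params):
    """Build 'Nationality occupation' from parsed infobox parameters, or None."""
    nationality = first_item(params.get("nationality", ""))
    occupation = first_item(params.get("occupation", ""))
    if not nationality or not occupation:
        return None  # leave the article for a human
    return f"{nationality} {occupation}"


# Hypothetical parameter values matching the example options listed above.
example = {
    "nationality": "Austro-Hungarian and Czechoslovakian",
    "occupation": "novelist, short story writer and insurance officer",
}
print(person_description(example))
```

Whether taking the first listed value is always the right choice for Infobox person is exactly the open question above; the sketch only shows that the simple option is cheap to generate.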
Please see Wikipedia:Bots/Requests for approval/DannyS712 bot. -- DannyS712 ( talk) 06:29, 7 January 2019 (UTC)
{{short description}}?
Please see the RfD discussion : Bhunacat10 (talk), 11:08, 15 January 2019 (UTC)
Please see Template talk:Infobox album#Adding short description. Galobtter ( pingó mió) 08:05, 20 January 2019 (UTC)
I've proposed making shortdesc helper into a gadget here. Galobtter ( pingó mió) 08:12, 20 January 2019 (UTC)
I noticed a section naming a few sub-tasks including one about scuba diving, which appears to be a personal task (that is already completed). Can any project member add to the list of personal tasks? I am currently focusing on adding short descs. for archaeology projects and would like to include that if it's allowed. YuriNikolai ( talk) 23:46, 8 February 2019 (UTC)
I added 9 short descriptions: one for each of the state capitals in New England, and the largest cities in New Hampshire, Maine, and Vermont (the capital and largest city are the same city in Massachusetts and Rhode Island, and the largest city in Connecticut may not stay the largest).
Am I doing it right? Should I continue? HotdogPi 21:43, 28 April 2018 (UTC)
Template talk:Disambiguation#Edit request for inclusion of short description template
Discussion there is getting wider. I created {{ Disambiguation page short description}}, which holds the string that is used on {{ Disambiguation}} and others (hopefully!) for easy modification if necessary. (Please someone protect this; it affects/will affect hundreds of thousands of articles.) That new string template is only added to a couple of disambig templates at the moment; I await some sense of agreement with the principle before changing {{ Disambiguation}} and others to use it.
Outstanding topics include what value is gained from treating "sets" differently, such as the reversion of "type=disambiguation page" on {{ surname}}, which put all of those broadly-dab pages back into Category:Articles with short description. Outriggr ( talk) 04:22, 2 May 2018 (UTC)
<includeonly>{{short description|{{Disambiguation page short description}}|pagetype = Disambiguation page}}</includeonly>
Outriggr ( talk) 04:47, 2 May 2018 (UTC)
I noticed that the template still displays as a hidden block on the rendered page. This isn't really good for long term strategy, with regard to content reuse/alternative engines etc. The only reason to do it is to make it visible for those who don't want to use the gadget to make it visible. We should really deprecate that practice in my opinion, especially since using this template seems to have become the standard. — TheDJ ( talk • contribs) 09:39, 11 May 2018 (UTC)
{{short description|Xyz}} to {{SHORTDESC:Xyz}}, which would be a trivial job, as I'm sure you'll agree. However, there may be sufficient advantages to wrapping in a template (ability to add/remove tracking categories, etc.) that it might be better to retain the template and just cut out the hidden text at some point in the future. We could do that at any time, of course, but I'd want to be sure that we wouldn't hinder the process of adding short descriptions to articles. Any thoughts on how we might gauge that? -- RexxS ( talk) 12:12, 12 May 2018 (UTC)
For a project of this scale – generating short descriptions for 5.5 million articles – there does seem a need for more widespread awareness (I've added a plug in Community portal), and more communication of what is going on. For example: the project page does include "we are working on making infoboxes generate descriptions automatically", but no more detail than that (until I just added to it). I've been following this recently, but never spotted any announcement that half-a-million placename articles had already had descriptions generated out of {{ Infobox settlement}}. Well done! But AFAIK no-one said. It just happened one day. Would be good to know if plans are afoot to do similar with other common types of infobox, then we wouldn't waste our time typing in descriptions for articles carrying those types of infobox. A lot of articles, though, have no infobox and there are editors keen to keep it that way. How about the proposal to generate descriptions out of leads – how's that going? : Noyster (talk), 08:35, 12 May 2018 (UTC)
This discussion tends to bear me out about awareness. Until the project wins wider acceptance, it may be well for those of us adding SDs by manual editing to avoid annoying people via their watchlists, by binding ourselves to the rule imposed on bot operators: do it only as part of an edit that also alters the appearance of the page as rendered (in this case, meaning "as rendered on desktop"). : Noyster (talk), 09:51, 27 May 2018 (UTC)
Hi all, there's currently a problem with the short description on San Francisco that breaks the page on the iOS app. The short description that's being generated includes a line break, which is causing problems. The short description seen in iOS app search is: "City and County in California in California", and when you open the page, it just shows the lead image and nothing else. Here's a screenshot of what it looks like.
WMF developers are fixing the issue on our end -- our display should be resilient enough to handle accidental line breaks in the description. But as we've been investigating the problem, it's been hard for us to understand what Template:Infobox settlement is doing with the short descriptions.
As far as I can tell, Infobox settlement is pulling information to dynamically create the short description. But I would imagine that the descriptions would use the same format, and they don't. Some examples of the short descriptions I'm currently seeing in the app:
In the Json blob, the San Francisco article says:
I'm also seeing a mistake in the metropolitan areas:
I assume this is pulling information from the infobox itself, so is the infobox for Chicago metropolitan area missing a property that would fill that empty space? It would be great if someone could explain how these descriptions are constructed, so that our devs can help when there's a bug, like the one on the San Francisco page.
Also, when there's an obvious fail like "Los Angeles: Place", what can an editor do to add a better description? I know that Pbsouthwood was concerned a while back about editors not being able to see the short description in the wikitext, so they can edit or update them. Is that still a concern? I looked for anything related to the short description on these pages, and there's no indication of what the description is, or how to fix it. -- DannyH (WMF) ( talk) 22:46, 25 May 2018 (UTC)
short_description = parameter. However, to meet cases of need like those above I think there should be more obvious means of editing these autogenerated SDs, which are being created from increasing numbers of other infoboxes as well. The autogenerated SDs cannot be overridden using the helper script, even though this provides an "Edit" button (discussed here). Nor can they be overridden using the {{ Short description}} template unless the template is placed below the infobox, rather than above as recommended (discussed here). I agree that it would be good to have the SD visible in wikitext, and also in preview mode. : Noyster (talk), 07:56, 26 May 2018 (UTC)
Do we have a constraint which would notice if there are two {{ short description}} templates, add some red text on the page and possibly add it to a service category?-- Ymblanter ( talk) 08:22, 16 November 2018 (UTC)
Hello everyone. I have developed a way to procedurally generate short descriptions for the 387,816 articles in Category:Articles with 'species' microformats. My method, while not perfect, generally produces a better short description than the Wikidata description. It operates by copying a snippet from the first sentence of the lede that is suitable as the short description. This method relies on the relatively systematic way articles in this category are written - use caution before expanding this beyond this category.
The purpose of this is to explain exactly how a program would generate these summaries:
Download the wikitext for an article in Category:Articles with 'species' microformats
Run the regex string (?<=(. a | an ))(.*?)(?=(\.|,| which | known | found | describe|<ref|\(| native | grow| that | within | from | cause)) on the wikitext
Take the first match generated by this regex, ignore/discard the rest.
This produces the basic short description; now we need to clean it up.
Start loop
Run the regex \[\[[^\]]*\| to identify the left side of piped links.
If any matches were found, remove the matched text and repeat the loop
End loop
Run the above loop three more times, replacing the regex line with each of the lines below in turn, to strip out links, bold, and italics
Run the regex \[
Run the regex \]
Run the regex [']{2,}
If the string is "Gram-negative" replace it with "Gram-negative bacteria"
Check whether there's a space in the string
if there is not
Add the article to a list for carbon-based intelligence to deal with, then skip the article
Check the length of the string
if length in characters > 70
Add the article to a list for carbon-based intelligence to deal with, then skip the article
else, add {{shortdescription|(the remaining regex match)}} to the article. Include attribution in the edit summary.
The regex looks for a string that is immediately preceded by "any character, space, lowercase a, space" or "space, lowercase a, lowercase n, space". The "any character" is there because the lookbehind alternatives must be of the same length (four characters in this case). It then matches any number of characters (the short description) until the string immediately in front of it is one of several stop codes.
All links and bolding/italics are then stripped out of the short description. If the short description is longer than 70 characters, it is left for a human. If the short description contains no spaces, it's left for a human. Otherwise it is posted at the top of the article in the shortdescription template.
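The steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the script actually used: the function and constant names are hypothetical, the lookbehind is written without the capturing group (Python's `re` module accepts alternatives of equal width there), surrounding whitespace is trimmed, and fetching the wikitext and posting the edit are left out.

```python
import re

# First-pass pattern from the post above: text preceded by ". a " or " an ",
# lazily matched up to the first stop code.
SNIPPET_RE = re.compile(
    r"(?<=. a | an )(.*?)"
    r"(?=\.|,| which | known | found | describe|<ref|\(| native | grow"
    r"| that | within | from | cause)"
)
PIPED_LINK_LEFT_RE = re.compile(r"\[\[[^\]]*\|")  # left side of piped links
MAX_LENGTH = 70  # characters; longer results are left for a human


def strip_markup(text):
    """Remove piped-link targets, link brackets, and bold/italic quotes."""
    while PIPED_LINK_LEFT_RE.search(text):
        text = PIPED_LINK_LEFT_RE.sub("", text)
    text = text.replace("[", "").replace("]", "")
    return re.sub(r"'{2,}", "", text)


def generate_short_description(wikitext):
    """Return a candidate short description, or None to leave it for a human."""
    match = SNIPPET_RE.search(wikitext)  # only the first match is used
    if match is None:
        return None
    description = strip_markup(match.group(1)).strip()  # trimming is an assumption
    if description == "Gram-negative":
        description = "Gram-negative bacteria"
    if " " not in description:  # single-word results are skipped
        return None
    if len(description) > MAX_LENGTH:
        return None
    return description


if __name__ == "__main__":
    sample = ("'''''Profundiconus pacificus''''' is a species of "
              "[[sea snail]], a marine [[gastropod]].")
    print(generate_short_description(sample))
```

A return value of None corresponds to "add the article to a list for a human, then skip the article" in the steps above.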
I used Random page in category Articles with 'species' microformats to generate a sample of articles. The article, my procedurally generated short description, and the wikidata description are included.
Article | Procedurally generated description | Wikidata description | Notes | Comments |
---|---|---|---|---|
Profundiconus pacificus | species of sea snail | species of mollusc | A good example of the improvements achievable over a wikidata import. | |
Catocala caesia | moth of the Erebidae family | species of insect | Significant improvement on WD (P) | |
Pterostylis daintreana | species of orchid endemic to eastern Australia | species of plant | endemic should probably be added to the stop codes | I think endemic is acceptable, but you could shorten a bit by substituting "from" for "endemic to" (P) |
Sewa taiwana | moth of the Drepanidae family | species of insect | ||
Lactobacillus pontis | (skipped, added to human list) | species of prokaryote | Algorithm produced "rod-shaped", which gets kicked for a lack of spaces. Bacteria articles are hard on my algorithm. Is there a subcategory I can skip? | |
Ross seal | true seal | species of mammal | Not a big improvement, but not unacceptable. (P) | |
Turner's thick-toed gecko | species of gecko | species of reptile | Not a big improvement as the title already contains "gecko", like previous example, but also not unacceptable (P) | |
Coleophora sylvaticella | moth of the Coleophoridae family | species of insect | ||
Solirubrobacter pauli | mesophilic Gram-positive and aerobic bacterium | (none) | 46 characters, algorithm got lucky here. | Very compact, quite informative. If anything a bit technical, "and" could be left out, reducing length to 42 characters if worth the effort. |
Leucotabanus ambiguus | species of horse flies in the subfamily Tabaninae | species of insect | big improvement on WD, could be improved, but should be good enough. (P) | |
Chersodromus | genus of snakes of the family Colubridae | genus of reptiles | Big improvement on WD (P) | |
Artedius harringtoni | (skipped, added to a list for humans to parse) | species of fish | Algorithm returned "demersal" which is rejected for lack of spaces. | |
Mitrella blanda | species of sea snail | species of mollusc | ||
Givira aregentipuncta | moth in the Cossidae family | species of insect | ||
Medicorophium | genus of amphipod crustaceans | genus of crustaceans | Improvement on WD. Could be improved, but probably good enough (P) | |
Scrophularia ningpoensis | perennial plant of the family Scrophulariaceae | species of plant | significant improvement on WD (P) | |
Anadasmus sororia | moth of the Depressariidae family | species of insect | ||
Hakea flabellifolia | shrub of the genus Hakea | species of plant | ||
Shrew | small mole-like mammal classified in the order Eulipotyphla | family of mammals | 58 characters | "Classified in" can be reduced to "in" takes it down to 48. This may be a generally applicable modification (P) |
Gascoyne's Scarlet | English cultivar of domesticated apple | apple | nice, probably close to optimum. (P) | |
Barred thicklip | species of fish belonging to the wrasse Family | species of fish | Need to switch order in which I strip links and check regex. | "belonging to" could be reduced to "in" (P) |
Moluccan scops owl | owl found in Indonesia | species of owl |
The things we need to decide:
To move this forward, we need to do a couple of things:
I'm inviting comments on this now - if it looks good, we can get a consensus for it and I'll start refining it.
Cheers, Tazerdadog ( talk) 05:38, 3 June 2018 (UTC)
{{ short description}} templates added by a human at the top of the article are not overridden by this process – as is already happening with descriptions generated out of infoboxes : Noyster (talk), 11:40, 3 June 2018 (UTC)
{{ short description}} at the top of the article (as a side-note, for the infoboxes a fix is in the pipeline, assuming something is happening with phab:T193857) Galobtter ( pingó mió) 11:55, 3 June 2018 (UTC)
Ok, I've had some time to mull over @ Galobtter:'s list, and I've arrived at the following basic conclusions.
1) Start by pulling the first three sentences, not just the first sentence. We got quite a few hat notes and abbreviations instead of sentences. I believe this can be done by changing a number, if it's a real problem let me know.
2) Remove parentheses and everything inside them before running my regex. Parentheses were mucking things up, and this is cleaner than my kludge of stopping at an open parenthesis. If this is a problem to implement, skip it.
3) Run my updated regex main expression (below). Notable features include adding "is" to ensure more robust starts, two new start codes ("is the", and "is" followed by "one"), and 5 new end codes.
(?<=(.. is a |. is an | is the |.... is (?=one )))(.*?)(?=(\.|,| which | known | found | describe|<ref| whose | native | grow| that | within | from | cause| used | and | with |;))
4) Note that single-word short descriptions are filtered out - if the regex returns something without spaces, it defaults to a skip. We don't need to change the way we've been testing, but everyone should bear this in mind when reading the results.
5) Similarly to the previous point, there's also a filter for maximum length. I currently have it set to skip if my regex produces something longer than 70 characters, but commentary on the appropriate length is very welcome. If you want to pitch in but don't want to program, looking at the test data and telling us where to set that character count would be helpful.
Could you please generate a new sample for us Galobtter?
Cheers. Tazerdadog ( talk) 12:13, 3 June 2018 (UTC)
(?<=(.. is a |. is an |. are a | are an | is the |.... is (?=one )))(.*?)(?=(\.|,| which | known | first described | describe|<ref| whose | native | grow| that | in which | from | cause| used | with |;))
Sorry for not doing anything on this for the last few days. I have just manually evaluated the first 100 short descriptions generated at User:Tazerdadog/organism descriptions first hundred. The results were: 18 skips, 4 bad descriptions, 5 OK descriptions, and 73 good/great descriptions.
Definitions:
Good/great: Better than wikidata's generic "Species of plant" style constructions
OK: Worse than wikidata, but better than nothing. May contain minorly misleading statements, or minor grammar errors
Bad: Worse than nothing - either very misleading, nonsensical, or containing a glaring grammar error.
Skip: The algorithm hit something it wasn't comfortable with and declined to return a short description. Such pages will either receive the wikidata short description or be reserved for a human.
9 of the 18 skips were for being too long. In my evaluation, most of the descriptions rejected for length didn't have any problems except that they were too long. I'm therefore going to bump up the filter from max 70 chars to max 90 chars. This is an area where non-technically inclined people can help me - sort by character count, and give me your opinion on where I should draw the line for "that's so long that it's worse than no short description at all".
I'm going to add a filter for "Contains exactly one space, and was terminated by a comma". These two-word constructions are generally part of a list of traits that I can't parse easily - it's safer to just skip them. If this filter had been in place on this pass, two bad descriptions and one good one would have been skipped.
I'm going to add a filter for short descriptions ending in "ing". Such endings are more likely to be grammatically incorrect. In this run, this filter would have skipped one bad description.
I'm going to add in the two new filters and modify the existing one, and plow through evaluating another 100 short descriptions.
At this point, I'd like to get a conversation going about the acceptable error rate for fully automatic posting. I think I'm close to that point, but we need a strong consensus that we're there before we go into a BRFA.
My opinion: On the articles that the bot doesn't skip, 80% must be better than wikidata (currently 73/82, 89.0%), and no more than 3 percent can have problems so severe that they're worse than nothing (currently 4/82, 4.9%). Either a 95% confidence interval, or a sample size of 250 is sufficient. (95% CI is pretty much the scientific standard, 250 sample size is the most I'd consign anyone to slog through checking my classifications.) Tazerdadog ( talk) 11:23, 8 June 2018 (UTC)
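For anyone wanting to re-check these figures as more samples are evaluated, here is a small Python sketch. The counts are the ones reported above (100 evaluated, 18 skipped, 73 good/great, 4 bad). The Wilson score interval is an assumption on my part: the post only asks for "a 95% confidence interval" without naming a method, and the function names are hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def summarise(evaluated, skipped, good, bad):
    """Rates are taken over the descriptions the bot did not skip."""
    posted = evaluated - skipped
    return {
        "posted": posted,
        "good_rate": good / posted,
        "bad_rate": bad / posted,
        "good_ci": wilson_interval(good, posted),
        "bad_ci": wilson_interval(bad, posted),
    }


if __name__ == "__main__":
    stats = summarise(evaluated=100, skipped=18, good=73, bad=4)
    print(f"posted: {stats['posted']}")
    print(f"good rate: {stats['good_rate']:.1%}, bad rate: {stats['bad_rate']:.1%}")
    print("good 95% CI:", stats["good_ci"])
    print("bad 95% CI:", stats["bad_ci"])
```

With only 82 non-skipped samples the intervals are fairly wide, which is one argument for going on to the 250-sample check.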
So you guys may have already noticed, but I created a project banner {{ WikiProject Short descriptions}} and a project category Category:WikiProject Short descriptions to make it easier to keep track of relevant pages under this project's scope. I also tweaked most of the categories populated by {{ short description}} to standardize them and ensure they weren't orphaned. (BTW, if any of you can figure out a better image that encapsulates the concept of "short descriptions", feel free to tweak the banner...) Cheers. — AfroThundr ( u · t · c) 04:08, 3 August 2018 (UTC)
Can the short description of an article be accessed and used in a disambiguation page somehow? -- Gonnym ( talk) 13:42, 19 September 2018 (UTC)
{{ short description|Natural number, composite number}} for 14 (number), which improves on the bare "Natural number" that the infobox would supply. Because the manual description replaces, not supplements the infobox-generated one, it really is no sort of CONTENTFORK at all: see the page info for 14 (number) if you need to be convinced. It is never bad practice to use manual methods when an automatic method doesn't give the required result, and that applies to both the editing and coding perspectives. I'm sorry my suggested solution didn't please you, but it does solve your issue. -- RexxS ( talk) 17:10, 19 September 2018 (UTC)
How do I find articles that need a short description? For example, I came across this article in the Featured section - Aggie Bonfire - and there is no short description template on the source page. Does that mean I can add one on it? Likewise for the page Beer Festival, thanks in advance - Vinvibes ( talk)
Editors are beginning to use {{Annotated link}} on disambiguation pages, which is great, but the formatting does not match MOS:DAB guidelines. For example, in Ben Nevis (disambiguation), three of the See also items have been added using {{Annotated link}}, and have the following format:
According to MOS:DAB, the format should be:
Specifically, a comma instead of a hyphen after the article title and the following word in lower case (if not a proper name). Is it possible to adjust the template or use parameters to effect these changes? It seems like a small detail, but disambiguation pages are designed to be quickly scanned and that is easier if there is a consistent format. Thanks, Leschnei ( talk) 17:37, 27 December 2018 (UTC)
{{Annotated link/sandbox|Nevis Radio|style=y}} -> Nevis Radio – Community radio station in Fort William, Scotland
{{Annotated link/sandbox|Nevis Radio|style=y|word=the}} -> Nevis Radio – Community radio station in Fort William, Scotland
{{Annotated link/sandbox|Nevis Radio|style=y|word=yes}} -> Nevis Radio – Community radio station in Fort William, Scotland
Parameter names can be changed, just needed something fast. -- Gonnym ( talk) 18:34, 27 December 2018 (UTC)
Hello, I've nominated this category to be renamed to Category:Articles with short descriptions (plural). Would you please offer input at Wikipedia:Categories for discussion/Log/2018 December 31? Thank you. Nyttend ( talk) 02:22, 31 December 2018 (UTC)
We now have about 672,000 articles with a short description input or generated within Wikipedia, against a target of 2 million that WMF has set us before the link to Wikidata descriptions is cut.
Almost 60,000 of these articles have had their SD added manually by editors (including imports from Wikidata using the Shortdesc helper). This activity has picked up in pace over the past six months, and is now adding SDs at the rate of about 500 per day. However, relying on this alone, at the present pace it would take about another seven years to hit the 2 million target.
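The seven-year figure follows from simple arithmetic on the numbers quoted above, which can be re-run as the totals change. A minimal Python sketch, assuming the rate of 500 per day stays constant and that nothing else contributes (the function name is hypothetical):

```python
import math


def years_to_target(target, current_total, rate_per_day):
    """Days and years needed to reach the target at a constant daily rate."""
    remaining = target - current_total
    days = math.ceil(remaining / rate_per_day)
    return remaining, days, days / 365.25


if __name__ == "__main__":
    remaining, days, years = years_to_target(
        target=2_000_000,       # target set by WMF
        current_total=672_000,  # articles with a local short description
        rate_per_day=500,       # current manual rate
    )
    print(f"{remaining:,} remaining, {days:,} days, about {years:.1f} years")
```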
Meanwhile, over 90% of the SDs so far added have been automatically generated from infoboxes, chiefly {{ Infobox settlement}}. But this side of the project appears to have stalled, with no updates since May to the lists above of infoboxes currently used as sources or being worked upon.
No-one appears to have yet tackled the 'biggie', which is deriving SDs from {{ Infobox person}} and all its derivatives.
Petscan says there are about 891,000 BLPs. Just {{ Infobox person}} is used on 321,000 articles, then {{ Infobox football biography}} on 161,000, {{ Infobox officeholder}} on 136,000, and so on. In view of these large numbers, and recalling that much of the original concern about relying on Wikidata descriptions was focussed on BLP issues, it seems to me that deriving SDs from these infoboxes is an essential step forward. The core of these SDs would presumably be "Nationality occupation", or for more specific infoboxes such as {{ Infobox architect}} it would just be "Nationality architect".
This done, there would still remain plenty of scope for human input in checking, correcting and improving the auto-generated SDs for articles about people : Bhunacat10 (talk), 12:31, 6 January 2019 (UTC)
"Nationality occupation" is a pretty bland description, and will seldom be optimum, but it will be good enough for a first approximation, as it is unlikely that anyone will have a strong objection to them because they will comply with BLP policy if the infobox does.
Austro-Hungarian novelist.
Austro-Hungarian and Czechoslovakian novelist, short story writer and insurance officer.
19th century novelist.
19th century novelist, short story writer and insurance officer.
19th century Austro-Hungarian novelist.
19th century Austro-Hungarian and Czechoslovakian novelist.
19th century Austro-Hungarian and Czechoslovakian novelist, short story writer and insurance officer.
Could be others, but those seem to me the major ones. Agree that the first nationality and occupation is indeed better in this case. Question is, is this always the case in Infobox Person? -- Gonnym ( talk) 16:37, 6 January 2019 (UTC)
Please see Wikipedia:Bots/Requests for approval/DannyS712 bot. -- DannyS712 ( talk) 06:29, 7 January 2019 (UTC)
{{ short description}}?
Please see the RfD discussion : Bhunacat10 (talk), 11:08, 15 January 2019 (UTC)
Please see Template talk:Infobox album#Adding short description. Galobtter ( pingó mió) 08:05, 20 January 2019 (UTC)
I've proposed making shortdesc helper into a gadget here. Galobtter ( pingó mió) 08:12, 20 January 2019 (UTC)
I noticed a section naming a few sub-tasks including one about scuba diving, which appears to be a personal task (that is already completed). Can any project member add to the list of personal tasks? I am currently focusing on adding short descs. for archaeology projects and would like to include that if it's allowed. YuriNikolai ( talk) 23:46, 8 February 2019 (UTC)