This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
About (for German umlauts -> letter+e; for French, dropped accents): what does that mean ? French accents are perfectly acceptable and accepted in a title, for all I know (e.g. Coup d'état). I don't know about umlauts, but shouldn't at least the affirmation of french accents be removed ? --FvdP 22:16 Feb 19, 2003 (UTC)
Moved from Wikipedia:Village pump:
Maybe we should update the somewhat off topic note in Wikipedia:Naming conventions (use English) "Use Latin-1 (ISO 8859-1) for the title of an article. Note not UTF-8 nor 7 bit ascii. While Western European accents are acceptable, non-Western European accents need to be dropped. (for German umlauts -> letter+e; for French, dropped accents). (Based on the post from Brion)" Docu 07:12 Feb 28, 2003 (UTC)
Current practice (discussion now moved to Wikipedia talk:Special characters) seems to be to use Latin-1 spellings for the articles and redirect from basic-ascii spellings, i.e. (for German umlauts -> letter+e; for French, dropped accents).
As not everybody necessarily has all characters on their keyboard, I suppose it's ok to create articles with Basic ASCII and have someone else move them later. Wikipedia:Special characters should answer these questions. That article needs updating though, as it was written when the Wikipedia software couldn't handle accents at all. Docu 12:37 Mar 2, 2003 (UTC)
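The "redirect from basic-ascii spellings" practice described above could be sketched roughly as follows. This is only an illustration, not how the Wikipedia software does anything: the function name `basic_ascii_spelling`, the `german` flag and the umlaut table are all assumptions made up for the example.

```python
import unicodedata

# Hypothetical table for the "German umlauts -> letter+e" convention.
GERMAN_UMLAUTS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}

def basic_ascii_spelling(title: str, german: bool = False) -> str:
    """Return a diacritic-free spelling suitable for a redirect title."""
    if german:
        # Apply letter+e before the generic accent dropping below.
        for accented, plain in GERMAN_UMLAUTS.items():
            title = title.replace(accented, plain)
    # Decompose each letter into base letter + combining marks,
    # then drop the combining marks ("dropped accents").
    decomposed = unicodedata.normalize("NFD", title)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(basic_ascii_spelling("Coup d'état"))
print(basic_ascii_spelling("Hermann Göring", german=True))
```

Letters that have no decomposition (for example ø or ł) pass through unchanged, so this sketch does not guarantee a pure ASCII result.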
Not using accents is crazy in some contexts. In the Irish language, the very word's meaning and pronunciation are created using an accent (usually a fada, as in é, pronounced e fada). Drop the fada and you change the word, meaning and pronunciation. It is ok to create a fada-less redirect, but under no circumstances could you possibly put the main article under what would be a wrongly spelled word. For example, Ireland's fifth president is Cearbhall Ó Dálaigh. Drop the fadas and you might as well put him in as Karol Odawle for all the sense it would make. A redirect from Cearbhall O'Dalaigh would be OK, but the main article would have to be spelt in the correct manner.
JtdIrL 03:47 Mar 3, 2003 (UTC)
I find it somewhat inconsistent. E.g. some insist that Danzig should be used instead of Polish Gdansk whenever the majority of the population was German or when the city was part of German states. OTOH, throughout the encyclopedia only Vilnius is used instead of Wilno. The same is with L'viv and Lviv used almost consistently instead of Lwow or Lwów. Does that mean that the policy is to use German names whenever possible and local names in other cases????!? szopen
FWIW, this isn't only a Polish issue either. Ukrainians want to use Kyiv as the name of the city usually known in English as Kiev, considering the latter, as a transliteration from Russian, to be an offensive relic of Russian imperialism. However, Kiev is by far the more common English name, so there is disagreement over whether "common, but possibly offensive" or "official, but very uncommon" should take precedence. See also Talk:Kolkata for previous discussion of Calcutta vs. Kolkata and related. -- Delirium 23:57, Nov 4, 2003 (UTC)
Any tips on naming conventions for non-English organisations? There is temptation by some to translate them where familiar words are used, not just transliterate. Compare Parliament of Sweden with Tweede Kamer or consider Partij van de Arbeid vs alternatives such as "Labour Party (Netherlands)" "Labour Party of the Netherlands", or Front National vs "National Front". Consider also Nederlands Instituut voor Oorlogsdocumentatie vs a translated "Netherlands Institute for War Documentation". 20:19, 16 Dec 2003 (UTC)
I'd like to propose a change. We are supposed to use the commonest form in English. How do we know which is the commonest? The Wiki method seems to be to use a crude Google search. Shouldn't we use the form most commonly used by intelligent, knowledgeable, literate speakers of English? Plenty of morons say "Milano" (mangling the pronunciation of course) but the English name is Milan, for example.
Secondly, some Wikipedians will accept the English name for a city, but not for the surrounding area directly named after it. E.g. Seville and its province. I see no logic in this whatsoever. Don't we need an express policy on this? — Chameleon 01:54, 12 May 2004 (UTC)
I have found that more and more foreign-made words are being poured into English WP as entries. I don't know if these words make sense to English speakers, and I am wondering if it's good or bad for the development of English WP? -- Yacht (talk) 10:17, Jul 17, 2004 (UTC)
I can't give you accurate examples, because English is not my mother tongue. I just come across some words that I don't think are normal English words, or that look foreign (like Führer, Fribytaren på Östersjön, Yuri, Yuzu etc.). I don't know if they are already widely used in English, or are just neologisms (aren't there any corresponding English words for them? I don't even know how to read them). I am just worried that this may happen: every language creates the corresponding synonym entry for that in English. -- Yacht (talk) 16:16, Jul 17, 2004 (UTC)
You could make a case that a large part of English is stolen foreign words! 'I', 'found', 'word' all come to Old English from proto-Germanic roots, 'foreign' and 'pour' are from Old French, 'made' comes to English from West Germanic, 'entry' is from Middle French. Wiki of course is Hawaiian, 'pedia' probably from Greek, misunderstood by Latin scholars. You were 'wondering', which comes to us through Old English from proto-Germanic, whether this affects the 'development' (a French word). Interestingly, no one seems to know where 'bad' comes from, but 'good' is another proto-Germanic word. I wouldn't let it keep you up at night. PS. Yacht is from Norwegian! Mark Richards 17:17, 17 Jul 2004 (UTC)
There's quite a well-known and oft-quoted saying: The problem with defending the purity of the English language is that English is about as pure as a cribhouse whore. We don't just borrow words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary. [2]. Führer is a commonly understood word from the events of the 1930s and 40s; the second example is the title of a Swedish book which I'm unfamiliar with, though the article gives a translation; Yuri I've not come across, while Yuzu apparently is a Japanese fruit, so it's not surprising that there isn't an English name for it. I thought Yacht came from Dutch! [3] -- Arwel 18:33, 17 Jul 2004 (UTC)
The purity of English is one thing, while what I am concerned about is another. I hope people don't think English WP is a romanized version of their own languages, and create thousands of entries in their romanized languages that should be in English (just like I created an entry Nie Zi for Crystal Boys, but later learned there is an English name for it, so I deleted Nie Zi, which may make no sense to English speakers). About Yacht, I think it's a German word, and that's why I chose that. :) -- Yacht (talk) 18:03, Jul 19, 2004 (UTC)
The problem is when there is no anglicisation of a term, the anglicisation is really bad, or the anglicisation has fallen out of favour (for political/historical/cultural reasons). Also, there may be two, or a number of competing anglicisations - which may lead to a preference for the most locally appropriate term.
All in all, you'll have to put up with, and expect, some non-English terms. Besides, as noted above, there are plenty of non-English terms that have been absorbed into our vocabulary. (Coup d'état isn't exactly English!)
zoney ███ talk 19:40, 23 Aug 2004 (UTC)
(I have posted a nearly similar paragraph on Talk:List_of_colleges_and_universities -please tell me politely that this double posting is wrong if it is ; I shall learn through my mistakes)
I had thought of editing some pages related to French universities and institutions, and I feel that some coherence should first be given to the choice of name to be used for each of these.
Reading this discussion page (notably the paragraph about parliaments or political parties) I am rather more puzzled than I could be before.
If I browse through various per-country university lists, I can only discover that there has been no common policy as to what to do with university names, e.g.
The page List_of_colleges_and_universities_starting_with_U is a special mess, with its mix of translated names (e.g. "university of Modena") and native names (e.g. "universita degli Studi di Pavia").
I think that some more precise instructions would be valuable to future editors. -- French Tourist 21:54, 30 Aug 2004 (UTC)
Many languages can be transliterated (romanized) by more than one system (e.g. pinyin, Wade-Giles, and others for Chinese). Is this article a good place to keep track of the conventions that are used on Wikipedia?
Is it understood that some names that have an established spelling may not be transliterated using the chosen convention? Do any languages use more than one system? With the impending adoption of UTF-8, is it better to just use the native text and IPA?
I'll start a list here from what I know; please add any systems that seem to be commonly accepted.
— Michael Z. 23:53, 2004 Sep 20 (UTC)
Hi, I'm trying to work out what standard policy should be towards diacritic marks in article titles. There doesn't seem to be a standard policy at the moment - or if there is, this page doesn't really give any clear guidance. I do see some discussion above here, but I'm not clear if rough consensus was reached.
I am aware that Wikipedia:Naming conventions (technical restrictions) says we are restricted to using "ISO Latin-1" in article titles - I'm asking about whether we should use those diacritic marks that are included in that character set.
On the one hand, there seems to be an argument that we should file the articles without the diacritic marks, since i) this is the English Wikipedia, and ii) most English speakers are oblivious to diacritic marks, and usually don't know how to type them.
On the other hand, the policy here for transliterations is that unless an Anglicized form is in common use in the English-speaking world, use the straight transliteration. So extending this would say that for things which are commonly written in the English-speaking world without the diacritic marks, write them without, and for things which aren't well known, use the diacritics.
I was originally of the first school (actually, originally originally I was of the second school, feeling we should always file articles under the full correct names, since redirects are always available, but I got borged into the policy here), but am now moving towards the second one. What does everyone else think?
Of course, one should always add a redirect from the other form (to prevent duplicate articles and aid linking, if nothing else).
Anyway, please speak up, and let's work something out and write it down! Noel (talk) 11:41, 16 Nov 2004 (UTC)
This issue has been debated elsewhere, for instance on Wikipedia:Village pump (policy). Perhaps it ought to be given its own discussion page, as it will no doubt come up again. / Uppland 14:37, 19 Nov 2004 (UTC)
I think there is a widespread and common Wikipedia practice, even if not written, to leave characters in article names which are present in character set "ISO Latin-1".
See these examples of existing Wikipedia titles from various European languages:
I don't really see why it is a question. It is sure, however, that it should be written down explicitly because there is a minority among names (up to 5% in my experience) which don't follow this practice.
-- Adam78 23:08, 19 Nov 2004 (UTC)
For Slovenian, Croatian, Bosnian and Serbian names (these peoples use languages that are written in modified Latin), it has been a fairly common practice to use the name without the Latin2 diacritics in the article title on en:, but use the same diacritics in the article text. Occasionally the same issue applies to Macedonian and Bulgarian names, which are originally Cyrillic but can use Latin2 to transliterate. The letters š/Š and ž/Ž are causing some confusion because Latin1 includes them (cf. titles and redirects at Miroslav Krleža or Vinko Žganec), but that's two out of three that are missing so they're generally avoided. The "wrongtitle" template has brought some more attention to the issue, but I don't think it should be used for this purpose. -- Joy [shallot] 22:43, 18 Jan 2005 (UTC)
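Whether a given title fits a character set can be checked mechanically. The sketch below uses Python's built-in codecs; the helper name `fits_charset` is made up for illustration. It also shows one possible source of the š/ž confusion mentioned above: strict ISO 8859-1 does not contain those letters, while the Windows-1252 superset often loosely called "Latin-1" does. Whether that is what happened on en: is an assumption, not something this page confirms.

```python
def fits_charset(title: str, charset: str = "iso-8859-1") -> bool:
    """Return True if every character of `title` is encodable in `charset`."""
    try:
        title.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

for title in ["Zürich", "Miroslav Krleža", "Vinko Žganec", "Pāṇini"]:
    print(title,
          "| ISO 8859-1:", fits_charset(title),
          "| Windows-1252:", fits_charset(title, "cp1252"))
```

A title that fails the check would need a diacritic-free title plus a note in the article text, which is roughly what the practice described above amounts to.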
After further discussion at Wikipedia:Naming conventions (technical restrictions)#charset issues, I created Template:Titlelacksdiacritics. Please follow up there, and at Template talk:Titlelacksdiacritics. -- Joy [shallot]
Since the Mediawiki 1.5 upgrade also happened to introduce UTF8 support, this can be put ad acta. We've been converting the vast majority of pages using {{ titlelacksdiacritics}} to the proper title and dropping the template. -- Joy [shallot] 29 June 2005 17:29 (UTC)
See earlier discussion: Wikipedia talk:Naming conventions (use English)/diacritics
The project page guidelines are fine. This is about the "less clearcut" cases.
See Transliteration for background on the concepts transliteration vs. transcription.
Convention: Name your pages in English and place the native transliteration on the first line of the article unless the native form is commonly used in English.
Please address specific points of the proposal (The "Zürich" case is the most borderline/controversial imho). dab 15:12, 20 Nov 2004 (UTC)
I would prefer:
As I see it, the critical issue is whether a policy should encourage or discourage usage of Latin-1 characters. I could live with both, but this might be a good moment to express this a tad more clearly.
-- Johan Magnus 16:59, 20 Nov 2004 (UTC)
I would prefer:
-- Philip Baird Shearer 14:30, 21 Nov 2004 (UTC)
I don't understand:
Inside articles, UTF works fine (we're only restricted to Latin-1 in article names), so there's no reason not to give the full proper name on the first line (as you suggest), in whatever script. Yes, it should also give a Romanization (which may use characters which are not in the Latin-1 set - a lot of Japanese names are like that, with the macrons on o and u, not available in Latin-1), so people know how to say it.
However, this is all somewhat afield from article titles, which is what I enquired about. When it comes to titles, I think "universally" is a bit much - if the preponderant use is in either direction (with or without diacritics), then go that way, I would say.
Noel (talk) 02:26, 22 Nov 2004 (UTC)
This is about article titles. Since there will be wikilinks in the article texts, it is also about article texts (piped links notwithstanding, it will be most common practice to just link to the article title). I would finally like PBS to come clear on his opinion: You would like to avoid all non-ascii titles, for any article type? I do believe there will be a strong consensus against this, but can we vote on this, just to get it out of the way (or, if it is decided to go this way, we can tell the software people that the UTF-8 transition is not required)? This will mean we will not only have Zurich, Catalhuyuk, but also Kurt Godel, Bekesy Gyorgy, Albrecht Durer, Neue Zurcher Zeitung and so on. Note that this would be a major change compared to the de-facto practice on WP. The de-facto practice on WP is that we give the name in its native spelling if possible in Latin-1, with a redirect that is stripped of any diacritics.
PBS, I will not reply to any argument involving (a) search engines or (b) keyboards or (c) grammar again. It's simply not the issue. I have pointed out very often that these are entirely beside the point without your showing the slightest inclination towards even recognizing the existence of my objections.
Logically, we will proceed as follows:
Can we get points 1-2 out of the way now, please, and then either scrap this proposal and go to a (imho, archaic 1970s-style) "ascii only" policy, or decide to make use of the 21st century's UTF and concentrate on the (granted, difficult) policy discussion of how to handle them? dab 09:51, 22 Nov 2004 (UTC)
for your reference: ASCII; Diacritic; ISO 8859-1; UTF-8 (if I'm using a term you don't understand, look it up rather than accusing me of 'rabbiting on' or obscurantism). It seems that we do not entirely disagree. It is rather a question of "nearly universal" vs. "preponderance". I agree that diacritic-less spellings that are frequently found in reputable publications should be used. I do not think that google hits should be decisive. Ok. let's find some common ground. Do you agree that personal names should be treated differently (prefer native spelling) from geographical names? We may need guidelines for cases where other encyclopedias can be shown to support either spelling (as researched by Jallan on Talk:Zurich). I agree that we could go either way in the Zurich case, but I insist that Pāṇini (personal name) is useful. dab 15:20, 22 Nov 2004 (UTC)
KEEP IT SIMPLE use the same rule for people and places:
Apart from anything else one would end up with inconsistencies like Baron "X von Zürich" born in Zurich which is an unnecessary complication.
BTW To make a point, rabbit means talk, any inner-city London kid could tell you that, as could most other children in England, but they might not know about diacritics! Keep the rules as simple as possible. Philip Baird Shearer 18:11, 22 Nov 2004 (UTC)
I agree with Philip Baird Shearer. I'm sick of being told that writing "Zurich" instead of "Zürich" is "lazy" or "unacademic" - I write it that way because that's its English name, and I have both common usage and the Oxford Manual of Style to back me up on that. I see no reason to make an exception to the "use English" rule on an English-language encyclopaedia just for diacritics, which are just as much an inter-language variable as spelling and pronunciation. Proteus (Talk) 13:25, 23 Jan 2005 (UTC)
Wikipedia:Naming conventions (use English) are fine with just about every non-geographical or non-personal stuff - for example, no-one should be bothered to list a German or Russian spelling in Aluminum, Automobile etc. But I'm fine when a name or place 1) contains accented/diacritic Latin symbols, as long as the language is from the Indo-European family; 2) is transliterated in accordance with modern phonetics-based guidelines, even when they conflict with commonly accepted names, unless there are specific guidelines, as for monarchical/noble titles. Both Hermann Göring and Moskva, mentioned in other sections, are perfect examples.
I personally see no point of keeping Moscow spelling, which originates from ancient name of grad Moskov and was never used in Russian language since the first England embassy under Alexey Mikhailovich. I don't expect the article to be moved under Moskva right now, although I believe the common spelling would gradually change to a phonetically correct one and so will the title. Witness the power of redirects - everyone can still hyperlink to Moscow in their articles, but when someone clicks it he arrives at Moskva.
I also don't have a problem with accented German, French etc. characters. There's little phonetical difference between Göring, Goering or Goring, so I don't see why we shouldn't use national name. As an example, some articles after Russian persons with a commonly accepted name, such as Khruschev, have already been transliterated according to the Wikipedia guidances with pre-existing versions listed at the beginning, and I've yet to see any problems. Witness the power of redirects again - the searches or hyperlinks by common name leads to a properly transliterated name, no pain!
Similarly, I hate hearing Churchill's name spelled as Chyerchile (Черчиль) in Russian, while Chyorchil (Чёрчил) would be phonetically correct; Hudson spelled as Goodzone (Гудзон) instead of Khadson (Хадсон); Paris spelled as Paridge (Париж) instead of Paree etc.
To summarize, I'm all for phonetical transcriptions in both personal and geographic names (with IPA notation of native name) and the use of ISO 8859-1 in titles, but redirects from common English names so that they could be properly linked. GOD BLESS REDIRECTS!!! -- DmitryKo 22:48, 12 Mar 2005 (UTC)
I'm not that bothered, but my hackles rise when someone claims that Wikipedia should strive for "correctness". Take the recent Göring → Goering move suggestion. I'm in two minds about this. Usually I think we should just leave the article where it was created, because quite often both the native spelling and the anglicised spelling are common in English speaking text. Here however you have a name that is almost universally spelled as Goering by English speakers. In that case my feeling is that we should put it where the English speaker expects to find it and keep a redirect for people who know, and will use, the native spelling. To say that the native spelling is "correct" in English is simply wrong. And most English language keyboards and browsers don't make the accents and diacritics easily accessible and English speakers cannot learn the accenting conventions of every language in which names of people and places are likely to be presented, so it really isn't a good idea to keep accented names except where the accented version is used almost universally by English speakers (Mallarmé and Fauré, for instance, may be examples where the accented form is more common in English). -- Tony Sidaway| Talk 13:51, 21 Jan 2005 (UTC)
Thus Göring is outnumbered more than 6 to 1 by Goering on English pages according to Google. -- Tony Sidaway| Talk 14:40, 21 Jan 2005 (UTC)
Funny, Google gives me "about 293,000 English pages for Göring" and "about 292,000 English pages for Goering". Yes, I know Google works differently in the USA and UK. But English is an official language here. Why do people keep claiming that Google search results prove something, but they rely on the way Google works specifically in two or three countries, and reject results in the other 150 or so? I'd like to see the policy state that for use in resolving disputes, only Google results in the USA are valid. Or else drop the whole pretence that they prove anything anyway. — Michael Z. 2005-01-21 17:12Z
Michael, what you are saying is: let's set up a search that does not differentiate between the two words and then say that they have the same usage! Using Google as set up in Australia, New Zealand, South Africa, and the UK (and probably other English speaking countries), they all differentiate, and all show that the difference in word usage is about five or six to one. Secondly, dab, you and I disagree about correctness (see above) but in the past you have said that authoritative references should be taken into account. A primary source for much of what is written uses Goering, and the references at the end of the article are also 2 to 1 in favour of Goering, with the dissenter being David Irving who is hardly a credible source, so I assume you agree with using Goering. I would put the article under Herman Goering with a first line of:
If Goering was not the most popular entry under English then I would put the article under Herman Goring
-- Philip Baird Shearer 18:15, 21 Jan 2005 (UTC)
Nobody's talking about endorsing any one method. The fact is that Goering is by about six-to-one the most common spelling in this case, however, and this is easily shown by using tools such as Google. While the tool isn't endorsed, this doesn't mean we ignore the facts just because Google is a good method for demonstrating them. If instead I go to my newspaper site, The Guardian, I find hundreds of Goerings and one Göring (from an article written by an actor who played the role in a play in which the German spelling was used). -- Tony Sidaway| Talk 19:16, 22 Jan 2005 (UTC)
[Moving this here from the other Göring discussion. / Uppland 20:04, 21 Jan 2005 (UTC)]
I would like to suggest a partial compromise which would work in anglicizing some names: I suggest that in those cases where a person has actually lived and worked in an English-speaking country and can be shown to have regularly used an English form of his or her name (i.e. not the occasional use inhibited by an English typewriter), the English form of the name would be fine. This has no relevance for the Göring article, but would solve the Masaryk article as T. G. Masaryk actually lived in England for some time. I checked the Times Digital Archive and got one hit for Tomas Masaryk, one for Tomás G. Masaryk, but 18 for Thomas Masaryk which is a more fully anglicized form, including the obituary from 1937. (There are more hits for Masaryk: "President Masaryk", "Professor Masaryk" etc.). Interestingly enough, the obituary for Eduard Beneš in 1948 spells Beneš in the Czech orthography (not plain Benes), but mentions Masaryk with the English Thomas.
I don't think we should expect any consistent application of Czech orthography from the typographers at The Times, but the last example shows that the English press did not consistently strip diacritics from foreign names (the article also mentions a couple of accented Czech place-names), and also suggests that The Times, at least on this occasion, seems to have made the distinction I am proposing here between a person who had himself used his name in an English form and somebody who had not. It also shows that Tomas Masaryk, without diacritics, is really the worst alternative, being neither Czech nor English. / Uppland 18:34, 20 Jan 2005 (UTC)
I don't see how these guidelines aren't followed for Tucson... the only thing that seems to be missing is a redirect at Cuk Ṣon. I'll leave it to Node ue to make one :) -- Joy [shallot]
It's time to discard this policy. We say "Name your pages in English and place the native transliteration on the first line of the article unless the native form is more commonly used in English than the anglicized form", but this is just not being followed. Almost every time an argument comes up over it, the policy loses out.
Look at Wurttemberg, Riksdag, Goering, Tweede Kamer, Zurich (that one's particularly ludicrous - walk up to the average person on the street in Auckland or New York or Sydney or London or Toronto etc and ask them to write "Zurich" on a piece of paper [so keyboards don't come into it] and they'll write "Zurich", not "Zürich"), etc, etc, etc, etc, etc. (The latest one is Spion Kop.)
I'm tired of getting in arguments with people, trying to apply this policy, only to have it ignored. It wouldn't be so upsetting if we just said "use the local name, with a redirect from the common English version, and mention the English version in the opening sentence", I would quite happily go along with that.
But it's really trying to have the policy say one thing, and then do something different in practice. It's time to document reality, which is that every page is done on a case-by-case basis, and it doesn't matter what name is most common in the English-speaking world. Unless people are prepared to actually follow this policy, and rename Zurich and all the rest, I'm going to change the page to give the policy actually followed in practice, which is "no uniform policy".
And no, this is not "disrupting Wikipedia to make a point". I am dead serious about changing this policy, because it's not what we are doing. Noel (talk) 19:31, 1 Feb 2005 (UTC)
You appear to be replying to Proteus, but I would very much like to hear your response to my original point.
The page says "use the most commonly used English version of the name for the article" (my emphasis). I don't think there's much question that the average person-in-the-street in the English-speaking world would spell it "Zurich", not "Zürich", but the article is at "Zürich", nonetheless (after lengthy debate, too). So clearly the policy described in the page does not describe the actual current state of affairs. I think this page should document our actual policy. Therefore, it needs to be changed to say something like "it's decided on a page-by-page basis".
(BTW, the Bombay page is now at Mumbai. Rather ironic, therefore, that you should have picked that as an example of a page to name using the "English name"....)
I care far less what policy we pick than that we pick one, document it accurately, and stick with it. Noel (talk) 22:20, 1 Feb 2005 (UTC)
If you read the rest of the sentence quoted above, it says "unless the native form is more commonly used in English than the anglicized form". That is a good policy, and I think it's usually followed. "Mumbai" follows it pretty well; Mumbai is now the normal name of the city (for example, on the BBC website), a change which I can remember happening in the last few years. It's a good policy and we should stick to it. (Mumbai is now no longer an example on this page). DJ Clayworth 22:35, 1 Feb 2005 (UTC)
Have you read anything I have written above about "keep it simple" and the reason for doing so? It is not incorrect to drop funny foreign squiggles when writing a word in English. The examples you give are just as correct if they are spelt as Attache, Naive, Encyclopaedia. An example I used before was El Nino which used to show up because one of the external links had the word "el-nino" as part of its link name. Now it does not, so that page will not be found by the many people in many English speaking countries. What is the point of producing pages which limit access to them by English speaking people because some Wikipedia editors like funny foreign squiggles? Why not use the format for such pages as:
Using this format would fulfil both ease of access for most English people and an educational remit of an Encyclopaedia. At the moment there is often nothing to indicate why funny foreign squiggles are appearing on a word. If such a format was used then at least someone not familiar with the word would learn that it was German, Spanish, French etc, which is not something most pages do at the moment.
At the very least people who like funny foreign squiggles on words could respect the primary author and leave words alone if they appear without them.
I wonder why is it that there is a debate on umlauts on Zurich in the en.wikipedia and not in fr.wikipedia?
I do not think it is time to discard the policy but to strengthen and to modify it so that both usages are included. See my 5 point suggestion under "Keep it simple" in the #Comments above. Philip Baird Shearer 03:30, 3 Feb 2005 (UTC)
I'm tired too, Jnc. The reason we're not making progress is that people still refuse to make the distinction between transliteration and language. Dropping diacritics is a question of (English) orthography, not one of the English language. Zurich is an English orthography for Zürich, but that doesn't make Zurich an Anglo-Saxon name.
Ionian Islands on the other hand is the English name for Ionia Nesia, which is the "English" diacritic-less orthography of the transliteration of Greek Iónia Nēsiá, in the Greek alphabet Ιόνια Νησιά. Because there is an English name, in this case, we do not use Ionia Nesia. This (and nothing else) is what we mean by "use English".
Can we at least agree to separate these cases, and argue about two unrelated policies, one about English, and the other about diacritics? I just refuse to continue to respond to comments that argue diacritics are un-English, and therefore "no diacritics" equals "English". So, no, Zürich is no breach of this policy, and "El Nino" is not "English for El Niño". dab (ᛏ) 15:31, 3 Feb 2005 (UTC)
I agree that the policy should be changed into 'Use Latin names and redirect from common English names' for both personal and geographic names in foreign-related topics. Naming an article in accordance with local phonetics wouldn't stop anyone from using its English counterpart for both referencing and hyperlinking, but it should provide a better insight into the culture of a particular region. Wiki is not paper and not an ASCII-based database of the past. I can't see why a (misspelled) anglicized name is better than a native name, because there are Mighty Redirects™.
A single potential trouble with diacritical marks is that every Unicode symbol can be combined with diacritic post-nominal codes, #769 to #869. This means that the acute-accented letters Á and Á look the same, but the corresponding Unicode character strings are different: in the first case it's a single character, Á, but the second is a two-character combination of A followed by the combining acute accent (#769). The Wiki software should be updated to convert such double-character cases into a single-character symbol and also account for these differences when searching. Besides that, I don't see any other technical problem with diacritic or national symbols in article names, as long as they are Latin-based. -- DmitryKo 11:13, 13 Mar 2005 (UTC)
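The single-character versus two-character problem described above can be illustrated with a short Python sketch. It uses the standard `unicodedata` module; the helper name `same_title` is hypothetical and only stands in for whatever comparison the wiki software might perform, which this page does not specify.

```python
import unicodedata

# Two strings that render identically but differ at the code-point level.
precomposed = "\u00c1"   # a single code point: LATIN CAPITAL LETTER A WITH ACUTE
combined = "A\u0301"     # "A" followed by COMBINING ACUTE ACCENT (decimal 769)


def same_title(a: str, b: str) -> bool:
    """Hypothetical helper: compare two titles after NFC normalization.

    NFC composes base letter + combining mark into the single
    precomposed character wherever one exists.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(precomposed == combined)             # raw comparison of the two strings
    print(len(precomposed), len(combined))     # code-point counts
    print(same_title(precomposed, combined))   # comparison after normalization
```

Normalizing both sides before comparing (or before storing a title) is one possible way to get the behaviour asked for above; it is an assumption here, not a description of what MediaWiki actually does.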
Does this and it also:
This last one is critical because when someone moves the article to a name like El Niño they also change every instance of the name to that version, and if the first person does not, a later editor does. This means that the page in the English Wikipedia does not show up in a search unless the search engine wraps "O" "OE" and Ö, which not all English search engines do (e.g. Google.co.nz or Google.co.uk). People who are not English or have a good grasp of English need to understand that for the vast majority of native English speaking people, diacritics are meaningless and are not used when searching for a word.
Using redirects for "El Nino" to a page called "El Niño" is not sufficient because the redirect "El Nino" does not show up in a search external to Wikipedia. For this to happen the text must be embedded in the page. Philip Baird Shearer 15:53, 13 Mar 2005 (UTC)
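For readers unfamiliar with what a search engine that "wraps" accented and unaccented letters would have to do, here is a minimal Python sketch of diacritic folding. The function name `fold_diacritics` is hypothetical, and this is only one simple approach (decompose, then drop combining marks); it says nothing about how Google or any other engine actually works.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Hypothetical helper: strip combining marks so accented and
    unaccented spellings compare equal.

    NFD splits each accented letter into base letter + combining mark;
    unicodedata.combining() is non-zero for the marks, which are dropped.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for title in ("El Niño", "Zürich", "Cearbhall Ó Dálaigh"):
        print(title, "->", fold_diacritics(title))
```

Note that this kind of folding maps "Göring" to "Goring", not to "Goering"; language-specific respellings such as umlaut to letter+e would need extra rules that this sketch does not attempt.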
Which Google are you using? http://www.google.com.au http://www.google.co.nz http://www.google.co.uk http://www.google.co.za all work the same way: they differentiate on diacritics, e.g. searches on "El Nino" and "El Niño" return different pages. http://www.google.ie and http://www.google.ca seem to be set up as bilingual (one of which uses diaeresis). Germany's http://www.google.de returns similar results to google.ca and google.ie. So it is a perceived cultural difference by Google, not a technical one. Using google.co.uk:
With +wikipedia the "El Niño" page is only picked up because the HTTP link in the external link "NOAA explanation" includes the string "el-nino-story". If I use another one like "Second battle of Zurich", UK Google does not pick up the main Wikipedia page (because all instances of Zurich have been changed to Zürich):
-- Philip Baird Shearer 02:36, 14 Mar 2005 (UTC)
| | El Niño | El Nino |
|---|---|---|
| non diacritic-aware | | |
| Web | 3,380,000 | 1,640,000 |
| English | 1,240,000 | 1,640,000 |
| en.wikipedia.org | 51 | 26 |
| diacritic-aware | | |
| Web | 3,000,000 | 2,970,000 |
| English | 2,180,000 | 2,150,000 |
| en.wikipedia.org | 71 | 71 |

| | Battle of Zurich | Battle of Zürich |
|---|---|---|
| non diacritic-aware | | |
| Web | 212,000 | 61,500 |
| English | 206,000 | 27,900 |
| en.wikipedia.org | 29 | 18 |
| diacritic-aware | | |
| Web | 212,000 | 66,100 |
| English | 206,000 | 29,900 |
| en.wikipedia.org | 23 | 13 |
Just try using http://www.google.co.uk and don't set any defaults, because most users will not do so. Then run the test.
The trouble with adding the common English names is that you have to do it and the chances are that pedants will remove them again, arguing that it is not the most common English version (see the Zurich talk page). Besides, if it is the most common English version, why not use it for the page name?
If we could agree that the format should be El Nino ( Spanish: El Niño) the problem would be "fixed". Your format does almost the same but it does not include the additional information that it is a Spanish word and that there is an article in the native language for those who are interested: See Battle of Hurtgen Forest for an example. Philip Baird Shearer 13:47, 14 Mar 2005 (UTC)
This is supposed to be an ENGLISH encyclopedia not an International one. Diacritic may have originated as a Greek word but it is used in English as an English word. If you were to read the diacritic page you would see that there are no diacritics in modern English unless it is on a borrowed word which had not been fully integrated in to English. One of the steps of integration is to strip off the diacritics. So please explain how That was dispelled about five minutes after you'd first brought it up.
It is not a technical issue, it is a cultural one. As Noel said in the second paragraph of this article, walk up to the average person on the street in Auckland or New York or Sydney or London or Toronto etc and ask them to write "Zurich" on a piece of paper [so keyboards don't come into it] and they'll write "Zurich", not "Zürich". Ask them to spell El Nino and very few would write El Niño. Name your pages in English and place the native transliteration on the first line of the article unless the native form is more commonly used in English than the anglicized form.
If the words are first spelt without diacritics and then explained in brackets where the borrowed word comes from and alternative spellings after that, then if it is used with or without diacritics throughout the rest of the article it has been defined at the top and is self referencing. Several other things have been achieved.
"The vast majority of English speakers would be able to find the word or phrase": this is the "argument" I'm referring to as "dispelled after five minutes". finding an article is a technical question, not a cultural one, and the problem is solved via redirects, so let's not bring that up again. Let's also not get wound up with the Zurich case, which has become the prototypical borderline case: Yes I agree it could be at Zurich, both solutions have been convincingly argued, including use of diacritics in major English language publications. Both would be right, ok? I'm not objecting to Zurich. What I'm objecting to is your apparent insistence that it is not possible to write about non-English names in the English language. Get it? names, not ordinary words. Philip is not an English name. Æðelflæð and Byrhtnoþ, on the other hand, are , very much, English names. After all this time, you still confuse orthography with language? this policy is not about diacritics, it is about cases Giovanni vs. John, or Roma vs. Rome, so why don't we just argue about these? As for "walking up to people in the street and asking them to spell something" are you serious?? So why am I spending time fixing factual errors in articles if what we're aiming at is not the correct information, but the most widespread misconception?
dab (ᛏ) 12:14, 15 Mar 2005 (UTC)
Before the system went into read-only mode yesterday, I had typed out a long reply to Dbachmann's last post. But on reflection (apart from stressing that I was talking about using search engines like "Ask Jeeves" outside Wikipedia) I am just repeating things which have already been written higher up the page. So I suggest that Dbachmann and I give it a rest and let some others who are interested in the subject contribute. Philip Baird Shearer 13:48, 17 Mar 2005 (UTC)
I have nothing against having the word with diacritics on the first line of an article if some people spell it that way, provided that there is also a diacritic-free version of the word or phrase. Indeed I would like to change this guideline to recommend that both versions must appear on the first line of an article for words or phrases which can be spelt with diacritics. What I object to is the removal of all references to a diacritic-free version of a word in the text of a page because a word with diacritics is "correct" and one without is "incorrect". If the problem were like AE v. CE English then it would be solved by primary author, but in numerous cases copy-editors ignore the initial diacritic-free version and change it to include a diacritic version. (and I am sure that some copy-editors do it the other way around, but I have not seen any articles so modified as they have stayed around for long). Philip Baird Shearer 15:02, 31 Mar 2005 (UTC)
That last edit implies that "English name" and "Anglicized spelling" are synonyms. Confusing, because that's not necessarily correct.
The repetition of two similar but subtly different conventions is also confusing. E.g., are titling and naming a page the same or different? — Michael Z. 2005-04-7 14:32 Z
"Take two" is now in place. Lets discuss that as it is an attempt to put in place what there before the recent changes which were not agreed upon before they were made on this talk page. Philip Baird Shearer 16:39, 7 Apr 2005 (UTC)
About (for German umlauts -> letter+e; for French, dropped accents): what does that mean ? French accents are perfectly acceptable and accepted in a title, for all I know (e.g. Coup d'état). I don't know about umlauts, but shouldn't at least the affirmation of french accents be removed ? --FvdP 22:16 Feb 19, 2003 (UTC)
Moved from Wikipedia:Village pump:
Maybe we should update the somewhat off topic note in Wikipedia:Naming conventions (use English) "Use Latin-1 (ISO 8859-1) for the title of an article. Note not UTF-8 nor 7 bit ascii. While Western European accents are acceptable, non-Western European accents need to be dropped. (for German umlauts -> letter+e; for French, dropped accents). (Based on the post from Brion)" Docu 07:12 Feb 28, 2003 (UTC)
Current practice (discussion now moved to Wikipedia talk:Special characters) seems to be to use Latin-1 spellings for the articles and redirect from basic-ascii spellings, i.e. (for German umlauts -> letter+e; for French, dropped accents).
As not everybody has necessarily all characters on their keyboard, I suppose it's ok to create articles with Basic ASCII and have someone else move them later. Wikipedia:Special characters should answer these questions. That article needs updating though, as it was written when the Wikipedia software couldn't handle accents at all. Docu 12:37 Mar 2, 2003 (UTC)
Not using accents is crazy in some contexts. In the Irish language, the very word's meaning and pronunciation is created using an accent (usually a fada, as in é, pronounced e fada). Drop the fada and you change the word, meaning and pronunciation. It is ok to create a fada-less redirect, but under no circumstances could you possibly put the main article under what would be a wrongly spelled word. For example, Ireland's fifth president is Cearbhall Ó Dálaigh. Drop the fadas and you might as well put him in as Karol Odawle for all the sense it would make. A redirect from Cearbhall O'Dalaigh would be OK, but the main article would have to be spelt in the correct manner.
JtdIrL 03:47 Mar 3, 2003 (UTC)
I find it somewhat inconsistent. E.g. some insist that Danzig should be used instead of Polish Gdansk whenever the majority of the population was German or when the city was part of German states. OTOH, throughout the encyclopedia only Vilnius is used instead of Wilno. The same is with L'viv and Lviv used almost consistently instead of Lwow or Lwów. Does that mean that the policy is to use German names whenever possible and local names in other cases????!? szopen
FWIW, this isn't only a Polish issue either. Ukrainians want to use Kyiv as the name of the city usually known in English as Kiev, considering the latter, as a transliteration from Russian, to be an offensive relic of Russian imperialism. However, Kiev is by far the more common English name, so there is disagreement over whether "common, but possibly offensive" or "official, but very uncommon" should take precedence. See also Talk:Kolkata for previous discussion of Calcutta vs. Kolkata and related. -- Delirium 23:57, Nov 4, 2003 (UTC)
Any tips on naming conventions for non-English organisations? There is temptation by some to translate them where familiar words are used, not just transliterate. Compare Parliament of Sweden with Tweede Kamer or consider Partij van de Arbeid vs alternatives such as "Labour Party (Netherlands)" "Labour Party of the Netherlands", or Front National vs "National Front". Consider also Nederlands Instituut voor Oorlogsdocumentatie vs a translated "Netherlands Institute for War Documentation". ( 20:19, 16 Dec 2003 (UTC)
I'd like to propose a change. We are supposed to use the commonest form in English. How do we know which is the commonest? The Wiki method seems to be to use a crude Google search. Shouldn't we use the form most commonly used by intelligent, knowledgeable, literate speakers of English? Plenty of morons say "Milano" (mangling the pronunciation of course) but the English name is Milan, for example.
Secondly, some Wikipedians will accept the English name for a city, but not for the surrounding area directly named after it. E.g. Seville and its province. I see no logic in this whatsoever. Don't we need an express policy on this? — Chameleon 01:54, 12 May 2004 (UTC)
I have found that more and more foreign-made words are being poured into English WP as entries. I don't know if these words make sense to English speakers, and I am wondering if it's good or bad for the development of English WP? -- Yacht (talk) 10:17, Jul 17, 2004 (UTC)
I can't give you the accurate examples, because English is not my mother tongue, i just come across some words i don't think they are normal English words, or foreign-like words (like Führer, Fribytaren på Östersjön, Yuri, Yuzu etc.) I don't know if they are already widely used in English, or just neologies (aren't there any corresponding English words for them? I don't even know how to read them). I am just worrying if this may happen: every language creates the corresponding synonym entry for that in English.-- Yacht (talk) 16:16, Jul 17, 2004 (UTC)
You could make a case that a large part of English is stolen foreign words! 'I', 'Found', 'word' all come to Old English from proto Germanic roots, 'foreign' and 'pour' are from old French, 'made' comes to English from West Germanic, 'entry' is from Middle French. Wiki of course is Hawaiian, 'pedia' probably from Greek, misunderstood by Latin scholars. You were 'wondering', which comes to us through old English from proto Germanic, whether this affects the 'development' (a French word). Interestingly, no one seems to know where 'bad' comes from, but 'good' is another proto-Germanic word. I wouldn't let it keep you up at night. PS. Yacht is from Norwegian! Mark Richards 17:17, 17 Jul 2004 (UTC)
There's quite a well-known and oft-quoted saying: The problem with defending the purity of the English language is that English is about as pure as a cribhouse whore. We don't just borrow words; on occasion, English has pursued other languages down alleyways to beat them unconscious and rifle their pockets for new vocabulary. [2]. Führer is a commonly understood word from the events of the 1930s and 40s; the second example is the title of a Swedish book which I'm unfamiliar with, though the article gives a translation; Yuri I've not come across, while Yuzu apparently is a Japanese fruit, so it's not surprising that there isn't an English name for it. I thought Yacht came from Dutch! [3] -- Arwel 18:33, 17 Jul 2004 (UTC)
the purity of English is one thing, while what i am concerning is another. I hope people don't think English WP is a romanization version of their own languages, and create thousands of entries in their romanized languages that should be in English (just like i created an entry Nie Zi for Crystal Boys, but later, i know there is an English name for it, so i deleted the Nie Zi which may make no sense to English speakers). About Yacht, i think it's a German word, and that's why i chose that. :) -- Yacht (talk) 18:03, Jul 19, 2004 (UTC)
The problem is when there is no anglicisation of a term, the anglicisation is really bad, or the anglicisation has fallen out of favour (for political/historical/cultural reasons). Also, there may be two, or a number of competing anglicisations - which may lead to a preference for the most locally appropriate term.
All in all, you'll have to put up with, and expect, some non-English terms. Besides, as noted above, there's plenty of non-English terms have been absorbed into our vocabulary. ( Coup d'état isn't exactly English!)
zoney ███ talk 19:40, 23 Aug 2004 (UTC)
(I have posted a nearly similar paragraph on Talk:List_of_colleges_and_universities -please tell me politely that this double posting is wrong if it is ; I shall learn through my mistakes)
I had thought of editing some pages related to French universities and institutions, and I feel that some coherence should first be given to the choice of name to be used for each of these.
Reading this discussion page (notably the paragraph about parliaments or political parties) I am rather more puzzled than I could be before.
If I browse through various per-country university lists, I can only discover that there has been no common policy as to what to do with university names, e.g.
The page List_of_colleges_and_universities_starting_with_U is a special mess, with its mix of translated names (e.g. "university of Modena") and native names (e.g. "universita degli Studi di Pavia").
I think that some more precise instructions should be precious to future editors. -- French Tourist 21:54, 30 Aug 2004 (UTC)
Many languages can be transliterated (romanized) by more than one system (e.g. pinyin, Wade-Giles, and others for Chinese). Is this article a good place to keep track of the conventions that are used on Wikipedia?
Is it understood that some names that have an established spelling may not be transliterated using the chosen convention? Do any languages use more than one system? With the impending adoption of UTF-8, is it better to just use the native text and IPA?
I'll start a list here from what I know; please add any systems that seem to be commonly accepted.
— Michael Z. 23:53, 2004 Sep 20 (UTC)
Hi, I'm trying to work out what standard policy should be towards diacritic marks in article titles. There doesn't seem to be a standard policy at the moment - or if there is, this page doesn't really give any clear guidance. I do see some discussion above here, but I'm not clear if rough consensus was reached.
I am aware that Wikipedia:Naming conventions (technical restrictions) says we are restricted to using "ISO Latin-1" in article titles - I'm asking about whether we should use those diacritic marks that are included in that character set.
On the one hand, there seems to be an argument that we should file the articles without the diacritic marks, since i) this is the English Wikipedia, and ii) most English speakers are oblivious to diacritic marks, and usually don't know how to type them.
On the other hand, the policy here for transliterations is that unless an Anglicized form is in common use in the English-speaking world, use the straight transliteration. So extending this would say that for things which are commonly written in the English-speaking world without the diacritic marks, write them without, and for things which aren't well known, use the diacritics.
I was originally of the first school (actually, originally originally I was of the second school, feeling we should always file articles under the full correct names, since redirects are always available, but I got borged into the policy here), but am now moving towards the second one. What does everyone else think?
Of course, one should always add a redirect from the other form (to prevent duplicate article creation and aid linking, if nothing else).
Anyway, please speak up, and let's work something out and write it down! Noel (talk) 11:41, 16 Nov 2004 (UTC)
This issue has been debated elsewhere, for instance on Wikipedia:Village pump (policy). Perhaps it ought to be given its own discussion page, as it will no doubt come up again. / Uppland 14:37, 19 Nov 2004 (UTC)
I think there is a widespread and common Wikipedia practice, even if not written, to leave characters in article names which are present in character set "ISO Latin-1".
See these examples of existing Wikipedia titles from various European languages:
I don't really see why it is a question. It is sure, however, that it should be written down explicitly because there is a minority among names (up to 5% to my experience) which don't follow this practice.
-- Adam78 23:08, 19 Nov 2004 (UTC)
For Slovenian, Croatian, Bosnian and Serbian names (these peoples use languages that are written in modified Latin), it has been a fairly common practice to use the name without the Latin2 diacritics in the article title on en:, but use the same diacritics in the article text. Occasionally the same issue applies to Macedonian and Bulgarian names, which are originally Cyrillic but can use Latin2 to transliterate. The letters š/Š and ž/Ž are causing some confusion because Latin1 includes them (cf. titles and redirects at Miroslav Krleža or Vinko Žganec), but that's two out of three that are missing so they're generally avoided. The "wrongtitle" template has brought some more attention to the issue, but I don't think it should be used for this purpose. -- Joy [shallot] 22:43, 18 Jan 2005 (UTC)
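Whether a given title fits a particular character set can be checked mechanically. The Python sketch below is only an illustration: the helper name `fits` is hypothetical, and the codec names are Python's own labels for ISO 8859-1, Windows-1252 and ISO 8859-2. As far as Python's codecs go, strict ISO 8859-1 has no š or ž, while Windows-1252 (which is often treated interchangeably with Latin-1 in practice) does include them; that difference may be behind the confusion mentioned above.

```python
def fits(title: str, codec: str) -> bool:
    """Hypothetical helper: True if every character of `title`
    can be encoded in the given character set."""
    try:
        title.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    titles = ["Zürich", "Miroslav Krleža", "Vinko Žganec", "Eduard Beneš"]
    # "latin-1" = ISO 8859-1, "cp1252" = Windows-1252, "iso8859_2" = Latin-2
    for title in titles:
        print(title, {c: fits(title, c) for c in ("latin-1", "cp1252", "iso8859_2")})
```

Letters such as č, found only in Latin-2 among these three sets, fail both the `latin-1` and `cp1252` checks in this sketch.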
After further discussion at Wikipedia:Naming conventions (technical restrictions)#charset issues, I created Template:Titlelacksdiacritics. Please follow up there, and at Template talk:Titlelacksdiacritics. -- Joy [shallot]
Since the Mediawiki 1.5 upgrade also happened to introduce UTF8 support, this can be put ad acta. We've been converting the vast majority of pages using {{ titlelacksdiacritics}} to the proper title and dropping the template. -- Joy [shallot] 29 June 2005 17:29 (UTC)
See earlier discussion: Wikipedia talk:Naming conventions (use English)/diacritics
The project page guidelines are fine. this is about the "less clearcut" cases.
See Transliteration for background on the concepts transliteration vs. transcription.
Convention: Name your pages in English and place the native transliteration on the first line of the article unless the native form is commonly used in English.
Please address specific points of the proposal (The "Zürich" case is the most borderline/controversial imho). dab 15:12, 20 Nov 2004 (UTC)
I would prefer:
As I see it, the critical issue is whether a policy should encourage or discourage usage of Latin-1 characters. I could live with both, but this might be a good moment to express this a tad more clearly.
-- Johan Magnus 16:59, 20 Nov 2004 (UTC)
I would prefer:
-- Philip Baird Shearer 14:30, 21 Nov 2004 (UTC)
I don't understand:
Inside articles, UTF works fine (we're only restricted to Latin-1 in article names), so there's no reason not to give the full proper name on the first line (as you suggest), in whatever script. Yes, it should also give a Romanization (which may use characters which are not in the Latin-1 set - a lot of Japanese names are like that, with the macrons on o and u, not available in Latin-1), so people know how to say it.
However, this is all somewhat afield from article titles, which is what I enquired about. When it comes to titles, I think "universally" is a bit much - if the preponderant use is in either direction (with or without diacrits), then go that way, I would say.
Noel (talk) 02:26, 22 Nov 2004 (UTC)
This is about article titles. Since there will be wikilinks in the article texts, it is also about article texts (piped links notwithstanding, it will be most common practice to just link to the article title). I would finally like PBS to come clear on his opinion: You would like to avoid all non-ascii titles, for any article type? I do believe there will be a strong consensus against this, but can we vote on this, just to get it out of the way (or, if it is decided to go this way, we can tell the software people that the UTF-8 transition is not required)? This will mean we will not only have Zurich, Catalhuyuk, but also Kurt Godel, Bekesy Gyorgy, Albrecht Durer, Neue Zurcher Zeitung and so on. Note that this would be a major change compared to the de-facto practice on WP. The de-facto practice on WP is that we give the name in its native spelling if possible in Latin-1, with a redirect that is stripped of any diacritics.
PBS, I will not reply to any argument involving (a) search engines or (b) keyboards or (c) grammar again. It's simply not the issue. I have pointed out very often that these are entirely beside the point without your showing the slightest inclination towards even recognizing the existence of my objections.
Logically, we will proceed as follows:
can we get points 1-2 out of the way now, please, and then either scrap this proposal and go to a (imho, archaic 1970s-style) "ascii only" policy, or decide to make use of 21st century's UTF and concentrate on the (granted, difficult) policy discussion of how to handle them? dab 09:51, 22 Nov 2004 (UTC)
for your reference: ASCII; Diacritic; ISO 8859-1; UTF-8 (if I'm using a term you don't understand, look it up rather than accusing me of 'rabbiting on' or obscurantism). It seems that we do not entirely disagree. It is rather a question of "nearly universal" vs. "preponderance". I agree that diacritic-less spellings that are frequently found in reputable publications should be used. I do not think that google hits should be decisive. Ok. let's find some common ground. Do you agree that personal names should be treated differently (prefer native spelling) from geographical names? We may need guidelines for cases where other encyclopedias can be shown to support either spelling (as researched by Jallan on Talk:Zurich). I agree that we could go either way in the Zurich case, but I insist that Pāṇini (personal name) is useful. dab 15:20, 22 Nov 2004 (UTC)
KEEP IT SIMPLE use the same rule for people and places:
Apart from anything else one would end up with inconsistencies like Baron "X von Zürich" born in Zurich which is an unnecessary complication.
BTW To make a point, rabbit means talk, any inner-city London kid could tell you that, as could most other children in England, but they might not know about diacritics! Keep the rules as simple as possible. Philip Baird Shearer 18:11, 22 Nov 2004 (UTC)
I agree with Philip Baird Shearer. I'm sick of being told that writing "Zurich" instead of "Zürich" is "lazy" or "unacademic" - I write it that way because that's its English name, and I have both common usage and the Oxford Manual of Style to back me up on that. I see no reason to make an exception to the "use English" rule on an English-language encyclopaedia just for diacritics, which are just as much an inter-language variable as spelling and pronunciation. Proteus (Talk) 13:25, 23 Jan 2005 (UTC)
Wikipedia:Naming conventions (use English) are fine with just about every non-geographical or non-personal stuff - for example, no-one should be bothered to list a German or Russian spelling in Aluminum, Automobile etc. But I'm fine when a name or place 1) contains accented/diacritic Latin symbols, as long as the language is from the Indo-European family 2) is transliterated in accordance with modern phonetic-based guidelines, even when they conflict with commonly accepted names, unless there are specific guidelines, as for monarchial/noble titles. Both Hermann Göring and Moskva, mentioned in other sections, are perfect examples.
I personally see no point of keeping Moscow spelling, which originates from ancient name of grad Moskov and was never used in Russian language since the first England embassy under Alexey Mikhailovich. I don't expect the article to be moved under Moskva right now, although I believe the common spelling would gradually change to a phonetically correct one and so will the title. Witness the power of redirects - everyone can still hyperlink to Moscow in their articles, but when someone clicks it he arrives at Moskva.
I also don't have a problem with accented German, French etc. characters. There's little phonetical difference between Göring, Goering or Goring, so I don't see why we shouldn't use national name. As an example, some articles after Russian persons with a commonly accepted name, such as Khruschev, have already been transliterated according to the Wikipedia guidances with pre-existing versions listed at the beginning, and I've yet to see any problems. Witness the power of redirects again - the searches or hyperlinks by common name leads to a properly transliterated name, no pain!
Similarly, I hate hearing Churchill's name spelled as Chyerchile (Черчиль) in Russian, while Chyorchil (Чёрчил) would be phonetically correct; Hudson spelled as Goodzone (Гудзон) instead of Khadson (Хадсон); Paris spelled as Paridge (Париж) instead of Paree etc.
To summarize, I'm all for phonetical transcriptions in both personal and geographic names (with IPA notation of native name) and the use of ISO 8859-1 in titles, but redirects from common English names so that they could be properly linked. GOD BLESS REDIRECTS!!! -- DmitryKo 22:48, 12 Mar 2005 (UTC)
I'm not that bothered, but my hackles rise when someone claims that Wikipedia should strive for "correctness". Take the recent Göring → Goering move suggestion. I'm in two minds about this. Usually I think we should just leave the article where it was created, because quite often both the native spelling and the anglicised spelling are common in English speaking text. Here however you have a name that is almost universally spelled as Goering by English speakers. In that case my feeling is that we should put it where the English speaker expects to find it and keep a redirect for people who know, and will use, the native spelling. To say that the native spelling is "correct" in English is simply wrong. And most English language keyboards and browsers don't make the accents and diacritics easily accessible and English speakers cannot learn the accenting conventions of every language in which names of people and places are likely to be presented, so it really isn't a good idea to keep accented names except where the accented version is used almost universally by English speakers (Mallarmé and Fauré, for instance, may be examples where the accented form is more common in English). -- Tony Sidaway| Talk 13:51, 21 Jan 2005 (UTC)
Thus Göring is outnumbered more than 6 to 1 by Goering on English pages according to Google. -- Tony Sidaway| Talk 14:40, 21 Jan 2005 (UTC)
Funny, Google gives me "about 293,000 English pages for Göring" and "about 292,000 English pages for Goering". Yes, I know Google works differently in the USA and UK. But English is an official language here. Why do people keep claiming that Google search results prove something, but they rely on the way Google works specifically in two or three countries, and reject results in the other 150 or so? I'd like to see the policy state that for use in resolving disputes, only Google results in the USA are valid. Or else drop the whole pretence that they prove anything anyway. — Michael Z. 2005-01-21 17:12Z
Michael, what you are saying is: let's set up a search that does not differentiate between the two words and then say that they have the same usage! Using Google as set up in Australia, New Zealand, South Africa, and the UK (and probably other English speaking countries), they all differentiate, and all show that the difference in the word usage is about five or six to one. Secondly, dab, you and I disagree about correctness (see above) but in the past you have said that authoritative references should be taken into account. A primary source for much of what is written uses Goering, and the references at the end of the article are also 2 to one in favour of Goering, with the dissenter being David Irving, who is hardly a credible source, so I assume you agree with using Goering. I would put the article under Herman Goering with a first line of:
If Goering was not the most popular entry under English then I would put the article under Herman Goring
-- Philip Baird Shearer 18:15, 21 Jan 2005 (UTC)
Nobody's talking about endorsing any one method. The fact is that Goering is by about six-to-one the most common spelling in this case, however, and this is easily shown by using tools such as Google. While the tool isn't endorsed, this doesn't mean we ignore the facts just because Google is a good method for demonstrating them. If instead I go to my newspaper site, The Guardian, I find hundreds of Goerings and one Göring (from an article written by an actor who played the role in a play in which the German spelling was used). -- Tony Sidaway| Talk 19:16, 22 Jan 2005 (UTC)
[Moving this here from the other Göring discussion. / Uppland 20:04, 21 Jan 2005 (UTC)]
I would like to suggest a partial compromise which would work in anglicizing some names: I suggest that in those cases where a person has actually lived and worked in an English-speaking country and can be shown to have regularly used an English form of his or her name (i.e. not the occasional use inhibited by an English typewriter), the English form of the name would be fine. This has no relevance for the Göring article, but would solve the Masaryk article as T. G. Masaryk actually lived in England for some time. I checked the Times Digital Archive and got one hit for Tomas Masaryk, one for Tomás G. Masaryk, but 18 for Thomas Masaryk which is a more fully anglicized form, including the obituary from 1937. (There are more hits for Masaryk: "President Masaryk", "Professor Masaryk" etc.). Interestingly enough, the obituary for Eduard Beneš in 1948 spells Beneš in the Czech orthography (not plain Benes), but mentions Masaryk with the English Thomas.
I don't think we should expect any consistent application of Czech orthography from the typographers at The Times. The last example shows that the English press did not consistently strip diacritics from foreign names (the article also mentions a couple of accented Czech place-names). It also suggests that The Times, at least on this occasion, made the distinction I am proposing here between a person who had himself used his name in an English form and somebody who had not. Finally, it shows that Tomas Masaryk, without diacritics, is really the worst alternative, being neither Czech nor English. / Uppland 18:34, 20 Jan 2005 (UTC)
I don't see how these guidelines aren't followed for Tucson... the only thing that seems to be missing is a redirect at Cuk Ṣon. I'll leave it to Node ue to make one :) -- Joy [shallot]
It's time to discard this policy. We say "Name your pages in English and place the native transliteration on the first line of the article unless the native form is more commonly used in English than the anglicized form", but this is just not being followed. Almost every time an argument comes up over it, the policy loses out.
Look at Wurttemberg, Riksdag, Goering, Tweede Kamer, Zurich (that one's particularly ludicrous - walk up to the average person on the street in Auckland or New York or Sydney or London or Toronto etc and ask them to write "Zurich" on a piece of paper [so keyboards don't come into it] and they'll write "Zurich", not "Zürich"), etc, etc, etc, etc, etc. (The latest one is Spion Kop.)
I'm tired of getting in arguments with people, trying to apply this policy, only to have it ignored. It wouldn't be so upsetting if we just said "use the local name, with a redirect from the common English version, and mention the English version in the opening sentence"; I would quite happily go along with that.
But it's really trying to have the policy say one thing, and then do something different in practice. It's time to document reality, which is that every page is done on a case-by-case basis, and it doesn't matter what name is most common in the English-speaking world. Unless people are prepared to actually follow this policy, and rename Zurich and all the rest, I'm going to change the page to give the policy actually followed in practice, which is "no uniform policy".
And no, this is not "disrupting Wikipedia to make a point". I am dead serious about changing this policy, because it's not what we are doing. Noel (talk) 19:31, 1 Feb 2005 (UTC)
You appear to be replying to Proteus, but I would very much like to hear your response to my original point.
The page says "use the most commonly used English version of the name for the article" (my emphasis). I don't think there's much question that the average person-in-the-street in the English-speaking world would spell it "Zurich", not "Zürich", but the article is at "Zürich", nonetheless (after lengthy debate, too). So clearly the policy described in the page does not describe the actual current state of affairs. I think this page should document our actual policy. Therefore, it needs to be changed to say something like "it's decided on a page-by-page basis".
(BTW, the Bombay page is now at Mumbai. Rather ironic, therefore, that you should have picked that as an example of a page to name using the "English name"....)
I care far less what policy we pick than that we pick one, document it accurately, and stick with it. Noel (talk) 22:20, 1 Feb 2005 (UTC)
If you read the rest of the sentence quoted above, it says "unless the native form is more commonly used in English than the anglicized form". That is a good policy, and I think it's usually followed. "Mumbai" follows it pretty well; Mumbai is now the normal name of the city (for example, on the BBC website), a change which I can remember happening in the last few years. It's a good policy and we should stick to it. (Mumbai is now no longer an example on this page). DJ Clayworth 22:35, 1 Feb 2005 (UTC)
Have you read anything I have written above about "keep it simple" and the reason for doing so? It is not incorrect to drop funny foreign squiggles when writing a word in English. The examples you give are just as correct if they are spelt as Attache, Naive, Encyclopaedia. An example I used before was El Nino, which used to show up because one of the external links had the word "el-nino" as part of its link name. Now it does not, so that page will not be found by the many people in many English-speaking countries. What is the point of producing pages which limit access to them by English-speaking people because some Wikipedia editors like funny foreign squiggles? Why not use the format for such pages as:
Using this format would fulfil both ease of access for most English people and the educational remit of an Encyclopaedia. At the moment there is often nothing to indicate why funny foreign squiggles are appearing on a word. If such a format were used then at least someone not familiar with the word would learn that it was German, Spanish, French etc, which is not something most pages do at the moment.
At the very least people who like funny foreign squiggles on words could respect the primary author and leave words alone if they appear without them.
I wonder why it is that there is a debate on umlauts on Zurich in en.wikipedia and not in fr.wikipedia?
I do not think it is time to discard the policy but to strengthen and to modify it so that both usages are included. See my 5 point suggestion under "Keep it simple" in the #Comments above. Philip Baird Shearer 03:30, 3 Feb 2005 (UTC)
I'm tired too, Jnc. The reason we're not making progress is that people still refuse to make the distinction between transliteration and language. Dropping diacritics is a question of (English) orthography, not one of the English language. Zurich is an English orthography for Zürich, but that doesn't make Zurich an Anglo-Saxon name.
Ionian Islands on the other hand is the English name for Ionia Nesia, which is the "English" diacritic-less orthography of the transliteration of Greek Iónia Nēsiá, in the Greek alphabet Ιόνια Νησιά. Because there is an English name, in this case, we do not use Ionia Nesia. This (and nothing else) is what we mean by "use English".
Can we at least agree to separate these cases, and argue about two unrelated policies, one about English, and the other about diacritics? I just refuse to continue to respond to comments that argue diacritics are un-English, and therefore "no diacritics" equals "English". So, no, Zürich is no breach of this policy, and "El Nino" is not "English for El Niño". dab (ᛏ) 15:31, 3 Feb 2005 (UTC)
I agree that the policy should be changed into 'Use Latin names and redirect from common English names' for both personal and geographic names in foreign-related topics. Naming an article in accordance with local phonetics wouldn't stop anyone from using its English counterpart for both referencing and hyperlinking, but it should provide a better insight into the culture of a particular region. Wiki is not paper and not an ASCII-based database of the past. I can't see why a (misspelled) anglicized name is better than a native name, because there are Mighty Redirects™.
A single potential trouble with diacritical marks is that every Unicode symbol can be combined with combining diacritical marks, which follow the base character (code points #768 to #879). This means that the acute-accented letters Á and Á look the same, but the corresponding Unicode character strings are different: the first is a single character Á (#193), while the second is a two-character combination of A followed by the combining acute accent (#769). The Wiki software should be updated to convert such double-character cases into a single-character symbol and also account for these differences when searching. Besides that, I don't see any other technical problem with diacritic or national symbols in article names, as long as they are Latin-based. -- DmitryKo 11:13, 13 Mar 2005 (UTC)
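The precomposed-versus-combining problem described above can be sketched in a few lines of Python using the standard `unicodedata` module. This is only an illustration under stated assumptions: `normalize_title` is a hypothetical helper name, and nothing here is meant to describe how the wiki software actually handles titles.

```python
import unicodedata

# One code point: LATIN CAPITAL LETTER A WITH ACUTE (U+00C1, decimal 193).
precomposed = "\u00c1"
# Two code points: "A" followed by COMBINING ACUTE ACCENT (U+0301, decimal 769).
combining = "A\u0301"


def normalize_title(title: str) -> str:
    """Hypothetical helper: fold a title to Unicode Normalization Form C,
    so that base letter + combining mark becomes one precomposed character
    wherever a precomposed form exists."""
    return unicodedata.normalize("NFC", title)


# The two strings render identically but compare as different...
looks_same_but_differs = precomposed != combining
# ...until both are normalized to the same form.
equal_after_nfc = normalize_title(precomposed) == normalize_title(combining)
```

Normalizing both the stored title and the search query to the same form is one way to make the two spellings of a title compare as equal.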
Does this, and it also:
This last one is critical because when someone moves the article to a name like El Niño they also change every instance of the name to that version, and if the first person does not, a later editor does. This means that the page in the English Wikipedia does not show up in a search unless the search engine treats "O", "OE" and "Ö" as equivalent, which not all English search engines do (e.g. Google.co.nz or Google.co.uk). People who are not English or have a good grasp of English need to understand that for the vast majority of native English speaking people, diacritics are meaningless and are not used when searching for a word.
Using redirects for "El Nino" to a page called "El Niño" is not sufficient because the redirect "El Nino" does not show up in a search external to Wikipedia. For this to happen the text must be embedded in the page. Philip Baird Shearer 15:53, 13 Mar 2005 (UTC)
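The search behaviour argued about here can be sketched as follows. This is a minimal illustration, not a description of how Google or any other engine works: `fold_diacritics`, `strict_match` and `folded_match` are hypothetical names, and the page texts in the usage note are invented examples.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Decompose to NFD, then drop the combining marks, leaving base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def strict_match(page_text: str, query: str) -> bool:
    """Assumed model of an engine that differentiates on diacritics:
    the query must occur literally in the page text (ignoring case)."""
    return query.lower() in page_text.lower()


def folded_match(page_text: str, query: str) -> bool:
    """Assumed model of an engine that ignores diacritics on both sides."""
    return fold_diacritics(query).lower() in fold_diacritics(page_text).lower()
```

Under these assumptions, a page whose text contains only "El Niño" is missed by `strict_match` for the query "El Nino" but found by `folded_match`, while a page whose opening line reads "El Nino (Spanish: El Niño)" is found either way. Note also that folding turns "Göring" into "Goring", not "Goering", so the letter+e spelling would still need to appear in the text or be handled separately.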
Which Google are you using? http://www.google.com.au http://www.google.co.nz http://www.google.co.uk http://www.google.co.za all work the same way: they differentiate on diacritics, e.g. searches on "El Nino" and "El Niño" return different pages. http://www.google.ie and http://www.google.ca seem to be set up as bilingual (one of the languages uses the diaeresis). Germany's http://www.google.de returns similar results to google.ca and google.ie. So it is a perceived cultural difference by Google, not a technical one. Using google.co.uk:
With +wikipedia the "El Niño" page is only picked up because the HTTP link in the external link "NOAA explanation" includes the string "el-nino-story". If I use another one like "Second battle of Zurich", UK Google does not pick up the main Wikipedia page (because all instances of Zurich have been changed to Zürich):
-- Philip Baird Shearer 02:36, 14 Mar 2005 (UTC)
| | El Niño | El Nino |
|---|---|---|
| non diacritic-aware | | |
| Web | 3,380,000 | 1,640,000 |
| English | 1,240,000 | 1,640,000 |
| en.wikipedia.org | 51 | 26 |
| diacritic-aware | | |
| Web | 3,000,000 | 2,970,000 |
| English | 2,180,000 | 2,150,000 |
| en.wikipedia.org | 71 | 71 |

| | Battle of Zurich | Battle of Zürich |
|---|---|---|
| non diacritic-aware | | |
| Web | 212,000 | 61,500 |
| English | 206,000 | 27,900 |
| en.wikipedia.org | 29 | 18 |
| diacritic-aware | | |
| Web | 212,000 | 66,100 |
| English | 206,000 | 29,900 |
| en.wikipedia.org | 23 | 13 |
Just try using http://www.google.co.uk; don't set any defaults, because most users will not do so. Then run the test.
The trouble with adding the common English names is that you have to do it, and the chances are that pedants will remove them again, arguing that it is not the most common English version; see the Zurich talk page. Besides, if it is the most common English version, why not use it for the page name?
If we could agree that the format should be El Nino (Spanish: El Niño) the problem would be "fixed". Your format does almost the same, but it does not include the additional information that it is a Spanish word and that there is an article in the native language for those who are interested: see Battle of Hurtgen Forest for an example. Philip Baird Shearer 13:47, 14 Mar 2005 (UTC)
This is supposed to be an ENGLISH encyclopedia not an International one. Diacritic may have originated as a Greek word but it is used in English as an English word. If you were to read the diacritic page you would see that there are no diacritics in modern English unless it is on a borrowed word which has not been fully integrated into English. One of the steps of integration is to strip off the diacritics. So please explain how That was dispelled about five minutes after you'd first brought it up.
It is not a technical issue; it is a cultural one. As Noel said in the second paragraph of this article: walk up to the average person on the street in Auckland or New York or Sydney or London or Toronto etc and ask them to write "Zurich" on a piece of paper [so keyboards don't come into it] and they'll write "Zurich", not "Zürich". Ask them to spell El Nino and very few would write El Niño. Name your pages in English and place the native transliteration on the first line of the article unless the native form is more commonly used in English than the anglicized form.
If the words are first spelt without diacritics and then explained in brackets where the borrowed word comes from and alternative spellings after that, then if it is used with or without diacritics throughout the rest of the article it has been defined at the top and is self referencing. Several other things have been achieved.
"The vast majority of English speakers would be able to find the word or phrase": this is the "argument" I'm referring to as "dispelled after five minutes". Finding an article is a technical question, not a cultural one, and the problem is solved via redirects, so let's not bring that up again. Let's also not get wound up with the Zurich case, which has become the prototypical borderline case: yes, I agree it could be at Zurich; both solutions have been convincingly argued, including use of diacritics in major English language publications. Both would be right, ok? I'm not objecting to Zurich. What I'm objecting to is your apparent insistence that it is not possible to write about non-English names in the English language. Get it? Names, not ordinary words. Philip is not an English name. Æðelflæð and Byrhtnoþ, on the other hand, are, very much, English names. After all this time, you still confuse orthography with language? This policy is not about diacritics, it is about cases like Giovanni vs. John, or Roma vs. Rome, so why don't we just argue about these? As for "walking up to people in the street and asking them to spell something", are you serious?? So why am I spending time fixing factual errors in articles if what we're aiming at is not the correct information, but the most widespread misconception?
dab (ᛏ) 12:14, 15 Mar 2005 (UTC)
Before the system went into read-only mode yesterday, I had typed out a long reply to Dbachmann's last post. But on reflection (apart from stressing that I was talking about using search engines like "Ask Jeeves" outside Wikipedia) I am just repeating things which have already been written higher up the page. So I suggest that Dbachmann and I give it a rest and let some others who are interested in the subject contribute. Philip Baird Shearer 13:48, 17 Mar 2005 (UTC)
I have nothing against having the word with diacritic on the first line of an article if some people spell it that way, providing that there is also a diacritic-free version of the word or phrase. Indeed I would like to change this guideline to recommend that both versions must appear on the first line of an article for words or phrases which can be spelt with diacritics. What I object to is the removal of all references to a diacritic-free version of a word in the text of a page because a word with diacritics is "correct" and one without is "incorrect". If the problem were like AE v. CE English then it would be solved by primary author, but in numerous cases copy-editors ignore the initial diacritic-free version and change it to include a diacritic version. (and I am sure that some copy-editors do it the other way around, but I have not seen any articles so modified as they have stayed around for long). Philip Baird Shearer 15:02, 31 Mar 2005 (UTC)
That last edit implies that "English name" and "Anglicized spelling" are synonyms. Confusing, because that's not necessarily correct.
The repetition of two similar but subtly different conventions is also confusing. E.g., are titling and naming a page the same or different? — Michael Z. 2005-04-7 14:32 Z
"Take two" is now in place. Let's discuss that, as it is an attempt to put in place what was there before the recent changes, which were not agreed upon on this talk page before they were made. Philip Baird Shearer 16:39, 7 Apr 2005 (UTC)