The dotted İ i and dotless I ı are distinct letters in the alphabets of a number of Turkic languages, unlike in English and most other languages that use the Latin script. This distinction has caused some issues in computing.
Unicode does not encode the uppercase form of dotless ı and the lowercase form of dotted İ as separate characters; instead it unifies them with the uppercase and lowercase forms of the Latin letter I, respectively. John Cowan proposed disunifying them from plain I and i by encoding a capital letter dotless I and a small letter i with dot above, to make the casing more consistent. [1] The Unicode Technical Committee had previously rejected a similar proposal [2] because it would break mappings from character sets that contain dotted and dotless I and would corrupt data in these languages.[citation needed]
Most Unicode software uppercases ı to I but, unless specifically configured for Turkish, lowercases I to i. Uppercasing and then lowercasing therefore changes the letter. Likewise, most Unicode software uppercases i to I rather than to İ, which also changes the letter in Turkish text.
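The round trip described above can be reproduced with Python's built-in string methods, which apply the default, locale-independent Unicode case mappings. This is only an illustrative sketch; the variable and helper names are chosen for this example.

```python
# Python's str.upper() and str.lower() use the default Unicode case
# mappings and ignore the process locale, so no Turkish rules apply.
DOTLESS_LOWER = "\u0131"  # ı LATIN SMALL LETTER DOTLESS I
DOTTED_UPPER = "\u0130"   # İ LATIN CAPITAL LETTER I WITH DOT ABOVE


def codepoints(text: str) -> list[str]:
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# ı uppercases to plain I, and plain I lowercases to dotted i,
# so the dotless letter is lost after one round trip.
round_trip = DOTLESS_LOWER.upper().lower()

# Plain i uppercases to plain I, not to the Turkish capital İ.
upper_of_i = "i".upper()

# İ lowercases to i followed by U+0307 COMBINING DOT ABOVE.
lower_of_dotted = DOTTED_UPPER.lower()

print(codepoints(round_trip))       # ['U+0069']
print(codepoints(upper_of_i))       # ['U+0049']
print(codepoints(lower_of_dotted))  # ['U+0069', 'U+0307']
```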
In the Microsoft Windows SDK, beginning with Windows Vista, several relevant functions have a NORM_LINGUISTIC_CASING flag, to indicate that for Turkish and Azerbaijani locales, I should map to ı.
In the LaTeX typesetting language, the dotless ı can be written with the backslash-i command: \i.
Dotted İ and dotless ı are problematic in the Turkish locales of several software packages, including Oracle DBMS, PHP, Java, [3] [4] and UnixWare 7, where implicit capitalization of the names of keywords, variables, and tables has effects that the application developers did not foresee. The C and US English locales do not have these problems. The .NET Framework has special provisions to handle the "Turkish i". [5]
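The kind of failure described above can be sketched as follows. Python's own case methods are not locale-sensitive, so the Turkish rules are simulated here with two hypothetical helpers, `turkish_upper` and `turkish_lower`; they are not part of any standard library, and the keyword set is likewise an assumption made for illustration.

```python
# Hypothetical helpers that imitate Turkish-locale casing: i <-> İ and
# ı <-> I are swapped first, then the default Unicode mapping handles
# every other letter.
def turkish_upper(text: str) -> str:
    return text.replace("i", "\u0130").replace("\u0131", "I").upper()


def turkish_lower(text: str) -> str:
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


# Illustrative keyword table, stored in uppercase ASCII.
KEYWORDS = {"INSERT", "SELECT", "WHERE"}


def is_keyword(word: str, upper=str.upper) -> bool:
    """Case-insensitive lookup that normalizes with the given upper()."""
    return upper(word) in KEYWORDS


print(is_keyword("insert"))                       # True
print(is_keyword("insert", upper=turkish_upper))  # False: becomes İNSERT
print(is_keyword("select", upper=turkish_upper))  # True: contains no i
print(turkish_lower("TITLE"))                     # tıtle
```

Normalizing identifiers with a fixed, locale-independent mapping, as the first call does, keeps such lookups stable regardless of the user's locale.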
Many cellphones available in Turkey (as of 2008) lacked proper localization, which led to ı being replaced by i in SMS messages, sometimes severely distorting the sense of a text. In one instance, a miscommunication played a role in the deaths of Emine and Ramazan Çalçoban in 2008. [6] [7] A common substitution is to use the character 1 for dotless ı. This is also common in Azerbaijan (see also translit), but the meaning of words is generally understood.
In some Ectaco translators, the letter İ was also treated as I (e.g. TRAFIK ⟨traffic⟩, when it is normally TRAFİK).
Preview | I | | i | | İ | | ı | |
---|---|---|---|---|---|---|---|---|
Unicode name | LATIN CAPITAL LETTER I | | LATIN SMALL LETTER I | | LATIN CAPITAL LETTER I WITH DOT ABOVE | | LATIN SMALL LETTER DOTLESS I | |
Encodings | dec | hex | dec | hex | dec | hex | dec | hex |
Unicode | 73 | U+0049 | 105 | U+0069 | 304 | U+0130 | 305 | U+0131 |
UTF-8 | 73 | 49 | 105 | 69 | 196 176 | C4 B0 | 196 177 | C4 B1 |
Numeric character reference | &#73; | &#x49; | &#105; | &#x69; | &#304; | &#x130; | &#305; | &#x131; |
Named character reference | | | | | &Idot; | | &imath;, &inodot; | |
ISO 8859-9 | 73 | 49 | 105 | 69 | 221 | DD | 253 | FD |
ISO 8859-3 | 73 | 49 | 105 | 69 | 169 | A9 | 185 | B9 |
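The byte values and character references in the table can be checked with Python's standard codecs and the `html` module. This is a small verification sketch; `encoded_hex` is a helper name chosen for this example, and `iso8859_9` and `iso8859_3` are Python's codec names for ISO 8859-9 and ISO 8859-3.

```python
import html

DOTTED_UPPER = "\u0130"   # İ
DOTLESS_LOWER = "\u0131"  # ı


def encoded_hex(ch: str, encoding: str) -> str:
    """Encode ch and return its bytes as space-separated uppercase hex."""
    return " ".join(f"{byte:02X}" for byte in ch.encode(encoding))


for encoding in ("utf-8", "iso8859_9", "iso8859_3"):
    print(encoding,
          encoded_hex(DOTTED_UPPER, encoding),
          encoded_hex(DOTLESS_LOWER, encoding))

# Numeric and named character references decode to the same letters.
print(html.unescape("&#304; &#x130; &Idot;"))
print(html.unescape("&#305; &#x131; &imath; &inodot;"))
```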