ISO/IEC 8859-9

From Wikipedia, the free encyclopedia
ISO/IEC 8859-9
MIME / IANA: ISO-8859-9
Alias(es): iso-ir-148, latin5, l5, csISOLatin5[1]
Standard: TS 5881, ECMA-128, ISO/IEC 8859
Classification: ISO 8859 (extended ASCII, ISO 4873 level 1)
Extends: US-ASCII
Based on: ISO/IEC 8859-1
Preceded by: ISO/IEC 8859-3
Other related encoding(s): Windows-1254

ISO/IEC 8859-9:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 9: Latin alphabet No. 5, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings; its first edition was published in 1989. It is designated ECMA-128 by Ecma International and TS 5881 as a Turkish standard.[2] It is informally referred to as Latin-5 or Turkish. It was designed to cover the Turkish language and was intended to be more useful than the ISO/IEC 8859-3 encoding; the vast majority of its users use it for Turkish, although it can also be used for some other languages. It is identical to ISO/IEC 8859-1 except that six Icelandic characters (Ðð, Ýý, Þþ) are replaced with characters unique to the Turkish alphabet (Ğğ, İ, ı, Şş). In Turkish, the uppercase of i is İ and the lowercase of I is ı.
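The six replaced positions and the Turkish casing rule can be illustrated with a minimal Python sketch. It assumes Python's built-in `latin_1` and `iso8859_9` codecs; `turkish_upper` and `turkish_lower` are hypothetical helper names introduced only for this illustration, since Python's own `str.upper()` and `str.lower()` are not locale-aware and pair i with I.

```python
# The six byte values where ISO/IEC 8859-9 departs from ISO/IEC 8859-1.
SIX_BYTES = bytes([0xD0, 0xDD, 0xDE, 0xF0, 0xFD, 0xFE])

as_latin1 = SIX_BYTES.decode("latin_1")    # the Icelandic letters
as_latin5 = SIX_BYTES.decode("iso8859_9")  # the Turkish letters


def turkish_upper(text: str) -> str:
    """Hypothetical helper: uppercase using the Turkish pairs i -> İ and ı -> I."""
    # Handle the dotted/dotless pairs first, then let str.upper() do the rest.
    return text.replace("i", "İ").replace("ı", "I").upper()


def turkish_lower(text: str) -> str:
    """Hypothetical helper: lowercase using the Turkish pairs İ -> i and I -> ı."""
    # İ must be replaced before I so that the two letters are not confused.
    return text.replace("İ", "i").replace("I", "ı").lower()


if __name__ == "__main__":
    print(as_latin1, as_latin5)
    print(turkish_upper("istanbul"), turkish_lower("IŞIK"))
    # Every Turkish letter fits in a single ISO-8859-9 byte.
    print(turkish_upper("istanbul").encode("iso8859_9"))
```

The helpers only cover the i/İ and I/ı pairs named above; a production system would more likely rely on a locale-aware library than on string replacement.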

ISO-8859-9 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. In modern applications Unicode and UTF-8 are preferred; authors of new web pages and designers of new protocols are instructed to use UTF-8 instead.[3] Since 2023, less than 0.05% of all web pages use ISO-8859-9,[4][5] while 2.1% of web pages located in Turkey declare use of ISO-8859-9.[6] However, the WHATWG Encoding Standard, which specifies the character encodings permitted in HTML5 and which compliant browsers must support,[7] requires that web pages marked as ISO-8859-9 be handled as Windows-1254.[3] Windows-1254 differs from ISO-8859-9 in that it uses the CR range (0x80 to 0x9F), which ISO-8859-9 reserves for C1 control codes, for additional graphical characters instead (analogous to the relationship between ISO-8859-1 and Windows-1252).
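The difference in the 0x80 to 0x9F range can be seen by decoding those bytes under both encodings. The following is a rough sketch using Python's `iso8859_9` and `cp1254` codecs. `decode_as_browser_would` is a hypothetical helper, not part of any library: its label set is just the names listed in this article, and it substitutes a replacement character for the few bytes Python's `cp1254` codec leaves undefined, which is a simplification rather than the behaviour the Encoding Standard prescribes.

```python
def compare_c1_range():
    """Decode bytes 0x80-0x9F under both encodings.

    Returns a list of (byte, iso_8859_9_char, windows_1254_char_or_None).
    None marks a byte that Python's cp1254 codec leaves undefined.
    """
    rows = []
    for byte in range(0x80, 0xA0):
        raw = bytes([byte])
        iso_char = raw.decode("iso8859_9")  # a C1 control character
        try:
            win_char = raw.decode("cp1254")  # usually a printable character
        except UnicodeDecodeError:
            win_char = None
        rows.append((byte, iso_char, win_char))
    return rows


# Labels taken from this article's name and alias list (an assumption of this sketch).
LATIN5_LABELS = {"iso-8859-9", "iso-ir-148", "latin5", "l5", "csisolatin5"}


def decode_as_browser_would(data: bytes, label: str) -> str:
    """Hypothetical helper: treat an ISO-8859-9 label as Windows-1254."""
    if label.strip().lower() in LATIN5_LABELS:
        return data.decode("cp1254", errors="replace")
    return data.decode(label)


if __name__ == "__main__":
    for byte, iso_char, win_char in compare_c1_range():
        print(f"0x{byte:02X}: U+{ord(iso_char):04X} vs {win_char!r}")
    print(decode_as_browser_would(b"\x93\xddstanbul\x94", "ISO-8859-9"))
```

Text that uses curly quotation marks or the euro sign at these byte values only decodes as intended under the Windows-1254 interpretation.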

Microsoft has assigned code page 28599, also known as Windows-28599, to ISO-8859-9 in Windows. IBM has assigned code page 920 (CCSID 920) to ISO-8859-9.[8][9] The standard is published by Ecma International as ECMA-128.[10]

Codepage layout


Differences from ISO-8859-1 are followed by their Unicode code point in parentheses.

ISO/IEC 8859-9[11][12][13]
   0 1 2 3 4 5 6 7 8 9 A B C D E F
0x
1x
2x SP ! " # $ % & ' ( ) * + , - . /
3x 0 1 2 3 4 5 6 7 8 9 : ; < = > ?
4x @ A B C D E F G H I J K L M N O
5x P Q R S T U V W X Y Z [ \ ] ^ _
6x ` a b c d e f g h i j k l m n o
7x p q r s t u v w x y z { | } ~
8x
9x
Ax NBSP ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ SHY ® ¯
Bx ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿
Cx À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï
Dx Ğ (U+011E) Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü İ (U+0130) Ş (U+015E) ß
Ex à á â ã ä å æ ç è é ê ë ì í î ï
Fx ğ (U+011F) ñ ò ó ô õ ö ÷ ø ù ú û ü ı (U+0131) ş (U+015F) ÿ
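The positions that differ from ISO-8859-1 can also be derived mechanically. The sketch below assumes Python's built-in `latin_1` and `iso8859_9` codecs agree with the layout above; `latin5_differences` is a hypothetical function name chosen for this illustration.

```python
def latin5_differences():
    """Return (byte, latin1_char, latin5_char, code_point) for every byte
    that decodes differently under ISO-8859-1 and ISO-8859-9."""
    diffs = []
    for byte in range(0x100):
        raw = bytes([byte])
        latin1 = raw.decode("latin_1")
        latin5 = raw.decode("iso8859_9")
        if latin1 != latin5:
            # Format the code point the way the table shows it (four hex digits).
            diffs.append((byte, latin1, latin5, f"{ord(latin5):04X}"))
    return diffs


if __name__ == "__main__":
    for byte, old, new, code_point in latin5_differences():
        print(f"0x{byte:02X}: {old} -> {new} (U+{code_point})")
```

Comparing the two codecs byte by byte avoids hard-coding the six positions, so the same loop could be pointed at any other pair of single-byte encodings.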

References

  1. Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12
  2. "Latin-5: A list of the Latin-5 client and server CCSIDs, which includes Turkey". IBM. Archived from the original on 2022-02-13.
  3. van Kesteren, Anne. "Names and labels". Encoding Standard. WHATWG.
  4. "Historical trends in the usage of character encodings for websites". w3techs.com.
  5. "Frequently Asked Questions". w3techs.com.
  6. "Distribution of character encodings among websites that use Turkey". w3techs.com.
  7. "8.2.2.3. Character encodings". HTML 5.1 2nd Edition. W3C. User agents must support the encodings defined in the WHATWG Encoding standard, including, but not limited to […]
  8. "Code page 920 information document". Archived from the original on 2017-01-16.
  9. "CCSID 920 information document". Archived from the original on 2016-03-27.
  10. Standard ECMA-128: 8-Bit Single-Byte Coded Graphic Character Sets - Latin Alphabet No. 5 (2nd ed.). 1999. This Ecma publication is also approved as ISO 8859-9.
  11. Code Page CPGID 00920 (pdf) (PDF), IBM
  12. Code Page CPGID 00920 (txt), IBM
  13. International Components for Unicode (ICU), ibm-920_P100-1995.ucm, 2002-12-03