Windows code page

Windows code pages are the sets of characters, or code pages, used in Microsoft Windows during the 1980s and 1990s. They were gradually superseded as Unicode was implemented in Windows, although they are still supported both within Windows and on other platforms.
There are two groups of code pages in Windows systems: OEM and ANSI code pages. Code pages in both of these groups are extended ASCII code pages.

ANSI code page

ANSI code pages are used by native non-Unicode applications that have a graphical user interface on Windows systems. The name "ANSI" is a misnomer, because the behavior does not exactly match any ANSI standard and because many non-ANSI standard encodings are allowed as the "ANSI" code page, most recently UTF-8.
Most legacy "ANSI" code pages have code page numbers in the pattern 125x. However, 874 and the East Asian multi-byte "ANSI" code pages, all of which are also used as OEM code pages, are numbered to match similar IBM encodings. While code page 1258 is also used as an OEM code page, it is original to Microsoft rather than an extension to an existing encoding. IBM has assigned its own, different numbers to Microsoft's variants; these are given for reference in the lists below where applicable.
All of the 125x Windows code pages, as well as 874 and 936, are labelled by the Internet Assigned Numbers Authority (IANA) as "Windows-number", although "Windows-936" is treated as a synonym for "GBK". Windows code page 932 is instead labelled as "Windows-31J".
ANSI Windows code pages, and especially code page 1252, were so called because they were purportedly based on drafts submitted or intended for ANSI. However, ANSI and ISO have not standardized any of these code pages. Instead, they fall into three groups:

Supersets of standard sets, such as those of ISO 8859 and the various national standards
Major modifications of these standard sets
Encodings with no parallel standard

In addition, Windows-1251 follows neither the ISO-standardised ISO-8859-5 nor the then-prevailing KOI-8.

About twelve of the typography and business characters from CP1252 at code points 0x80–0x9F are present in many other ANSI/Windows code pages at the same codes.
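The overlap in the 0x80–0x9F range can be checked empirically. The following is a minimal sketch that relies on the codec tables shipped with Python, which may differ in small details from the tables Windows itself uses; the helper name `shared_high_bytes` and the choice of code pages to compare are illustrative assumptions.

```python
# Code pages to compare (an arbitrary sample of the 125x series).
CODECS = ["cp1250", "cp1251", "cp1252", "cp1253", "cp1254", "cp1257"]

def shared_high_bytes(codecs=CODECS):
    """Return {byte: char} for bytes in 0x80-0x9F on which every codec agrees."""
    shared = {}
    for value in range(0x80, 0xA0):
        decoded = set()
        for name in codecs:
            try:
                decoded.add(bytes([value]).decode(name))
            except UnicodeDecodeError:
                decoded.add(None)  # byte is unassigned in this code page
        if len(decoded) == 1 and None not in decoded:
            shared[value] = decoded.pop()
    return shared

if __name__ == "__main__":
    for value, char in shared_high_bytes().items():
        print(f"0x{value:02X} -> U+{ord(char):04X} {char}")
```

Bytes such as 0x80, which is the euro sign in Windows-1252 but a Cyrillic letter in Windows-1251, do not appear in the result.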

OEM code page

The OEM code pages are used by Win32 console applications and by virtual DOS, and can be considered a holdover from DOS and the original IBM PC architecture. A separate suite of code pages was implemented not only for compatibility, but also because the fonts of VGA hardware favour encoding line-drawing characters in a way that is compatible with code page 437. Most OEM code pages share many code points, particularly for non-letter characters, with the second half of CP437.
A typical OEM code page, in its second half, does not resemble any ANSI/Windows code page even roughly. Nevertheless, two single-byte, fixed-width code pages and four multibyte CJK code pages are used as both OEM and ANSI code pages. Code page 1258 uses combining diacritics, as Vietnamese requires more than 128 letter-diacritic combinations. This is in contrast to VISCII, which replaces some of the C0 control codes.
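A short sketch can make both points concrete: the same high bytes mean different things in an OEM and an "ANSI" code page, and Windows-1258 text stores some Vietnamese letters as a base letter followed by a combining mark. The function names are hypothetical, and the example assumes Python's bundled `cp437`, `cp1252` and `cp1258` codecs match the Windows tables for the bytes used.

```python
import unicodedata

def decode_both(raw):
    """Decode the same bytes as OEM (CP437) and as "ANSI" (Windows-1252)."""
    return raw.decode("cp437"), raw.decode("cp1252")

def compose_vietnamese(raw):
    """Decode Windows-1258 bytes, then compose letters with their combining marks."""
    return unicodedata.normalize("NFC", raw.decode("cp1258"))

if __name__ == "__main__":
    oem, ansi = decode_both(bytes([0xC4, 0xCD, 0xE9]))
    print("CP437: ", oem)    # two box-drawing characters and a Greek capital theta
    print("CP1252:", ansi)   # three accented Latin letters
    # 0xEA is e-circumflex and 0xEC is a combining acute accent in Windows-1258.
    print("CP1258:", compose_vietnamese(b"\xea\xec"))
```

Normalizing to NFC after decoding is one way to obtain precomposed letters; whether that is desirable depends on the application.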

History

Initially, computer systems and system programming languages did not make a distinction between characters and bytes, which later led to much confusion. Microsoft software and systems prior to the Windows NT line are examples of this: they use the OEM and ANSI code pages, which do not make the distinction.
Since the late 1990s, software and systems have increasingly adopted more direct encodings of Unicode, in particular UTF-8 and UTF-16; this trend has been reinforced by the widespread adoption of XML, which provides a more adequate mechanism for labelling the encoding used. Recent Microsoft products and application programming interfaces use Unicode internally, but many applications and APIs continue to use the default encoding of the computer's locale when reading and writing text data to files or standard output. Therefore, though Unicode is the accepted standard, there is still backwards compatibility with the older Windows code pages.
The euro sign was introduced after many of the ANSI and OEM code pages were introduced; several code pages were revised to contain the euro sign.
Since Windows 10 version 1803, Windows machines can be configured to allow UTF-8 as the "ANSI" and OEM code page.
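Because many programs still fall back to the locale's default encoding, it can help to inspect what that default is. The sketch below is one possible way to do so from Python; `default_encodings` is a hypothetical helper, and the Windows branch calls the Win32 functions `GetACP` and `GetOEMCP` through `ctypes`, so it only runs on Windows.

```python
import locale
import sys

def default_encodings():
    """Report the encoding Python would use by default for text files."""
    info = {"preferred": locale.getpreferredencoding(False)}
    if sys.platform == "win32":
        import ctypes
        kernel32 = ctypes.windll.kernel32
        info["ansi_code_page"] = kernel32.GetACP()    # e.g. 1252, or 65001 for UTF-8
        info["oem_code_page"] = kernel32.GetOEMCP()   # e.g. 437 or 850
    return info

if __name__ == "__main__":
    for key, value in default_encodings().items():
        print(f"{key}: {value}")
```

Passing an explicit `encoding="utf-8"` argument to `open()` avoids depending on these defaults when a file's encoding is known.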

List

The following Windows code pages exist:

Windows-125x series

These nine code pages are all extended ASCII 8-bit SBCS encodings, and were designed by Microsoft for use as ANSI codepages on Windows. They are commonly known by their IANA-registered names as windows-<number>, but are also sometimes called cp<number>, "cp" for "code page". They are all used as ANSI code pages; Windows-1258 is also used as an OEM code page.

ID	Description	Relationship to ISO 8859 or other established encodings
1250	Latin 2 / Central European
1251	Cyrillic
1252	Latin 1 / Western European
1253	Greek
1254	Turkish
1255	Hebrew
1256	Arabic
1257	Baltic
1258	Vietnamese
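
The two naming schemes mentioned above can be seen in Python's codec registry, which accepts both the `windows-<number>` and `cp<number>` labels. This is a small sketch of that lookup; `canonical_codec` is a hypothetical helper name, and the canonical names it returns are Python's own, not IANA's.

```python
import codecs

def canonical_codec(label):
    """Return the name Python uses internally for an encoding label."""
    return codecs.lookup(label).name

if __name__ == "__main__":
    for number in range(1250, 1259):
        iana_label = f"windows-{number}"
        print(iana_label, "->", canonical_codec(iana_label))
```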

DOS code pages

These are also ASCII-based. Most of these are included for use as OEM code pages; code page 874 is also used as an ANSI code page.

437 – IBM PC US, 8-bit SBCS extended ASCII. Known as OEM-US, the encoding of the primary built-in font of VGA graphics cards.
708 – Arabic, extended ISO 8859-6
720 – Arabic, retaining box drawing characters in their usual locations
737 – "MS-DOS Greek". Retains all box drawing characters. More popular than 869.
775 – "MS-DOS Baltic Rim"
850 – "MS-DOS Latin 1". Full repertoire of ISO 8859-1.
852 – "MS-DOS Latin 2"
855 – "MS-DOS Cyrillic". Mainly used for South Slavic languages. Includes repertoire of ISO-8859-5. Not to be confused with cp866.
857 – "MS-DOS Turkish"
858 – Western European with euro sign
860 – "MS-DOS Portuguese"
861 – "MS-DOS Icelandic"
862 – "MS-DOS Hebrew"
863 – "MS-DOS French Canada"
864 – Arabic
865 – "MS-DOS Nordic"
866 – "MS-DOS Cyrillic Russian", cp866. Sole purely OEM code page included as a legacy encoding in WHATWG Encoding Standard for HTML5.
869 – "MS-DOS Greek 2", IBM869. Full repertoire of ISO 8859-7.
874 – Thai, also used as the ANSI code page, extends ISO 8859-11 with a few additional characters from Windows-1252. Corresponds to IBM code page 1162.
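
Many of these code pages differ from one another in only a few positions. As an illustration, the sketch below compares two single-byte codecs byte by byte; `differing_bytes` is a hypothetical helper, and the result reflects Python's bundled tables rather than Windows itself. Code page 858, for example, is code page 850 with the euro sign placed at byte 0xD5.

```python
def differing_bytes(codec_a, codec_b):
    """List the byte values that two single-byte codecs decode differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" maps unassigned bytes to U+FFFD instead of raising.
        if raw.decode(codec_a, errors="replace") != raw.decode(codec_b, errors="replace"):
            diffs.append(value)
    return diffs

if __name__ == "__main__":
    for value in differing_bytes("cp850", "cp858"):
        raw = bytes([value])
        print(f"0x{value:02X}: cp850={raw.decode('cp850')!r} cp858={raw.decode('cp858')!r}")
```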

East Asian multi-byte code pages

These often only partly match the IBM code pages of the same number: code pages 932, 936 and 949 differ from the IBM code pages of the same number, whereas Windows-951, as part of a kludge, is unrelated to IBM-951. IBM equivalent code pages are given in the second column. Code pages 932, 936, 949 and 950/951 are used as both ANSI and OEM code pages on the locales in question.

ID	IBM Equivalent	Language	Encoding	Use
932	943	Japanese	Shift JIS	ANSI/OEM
936	1386	Chinese	GBK	ANSI/OEM
949	1363	Korean	Unified Hangul Code	ANSI/OEM
950	1370, 1373	Chinese	Big5	ANSI/OEM
951	5471	Chinese	Big5-HKSCS	ANSI/OEM

A few further multiple-byte code pages are supported for decoding or encoding using operating system libraries, but not used as either sort of system encoding in any locale.

ID	IBM Equivalent	Language	Encoding	Use
1361	-	Korean	Johab	Conversion
20000	964	Chinese	CNS 11643	Conversion
20001	-	Chinese	TCA	Conversion
20002	-	Chinese	Big5	Conversion
20003	?	Chinese	IBM 5500	Conversion
20004	-	Chinese	Teletext	Conversion
20005	-	Chinese	Wang	Conversion
20932	954	Japanese	EUC-JP	Conversion
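
What "multi-byte" means in practice is that ASCII characters occupy one byte while most other characters occupy two. The following sketch measures this using Python's `cp932`, `cp936`, `cp949` and `cp950` codecs, which are assumed here to be close enough to the Windows code pages of the same numbers for the sample characters; `bytes_per_character` is a hypothetical helper.

```python
def bytes_per_character(text, codec):
    """Return the encoded length in bytes of each character in text."""
    return [len(ch.encode(codec)) for ch in text]

if __name__ == "__main__":
    samples = {
        "cp932": "A\u3042",  # Latin A, hiragana a
        "cp936": "A\u4e2d",  # Latin A, a CJK ideograph
        "cp949": "A\ud55c",  # Latin A, a hangul syllable
        "cp950": "A\u4e2d",
    }
    for codec, text in samples.items():
        print(codec, text.encode(codec), bytes_per_character(text, codec))
```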

EBCDIC code pages

37 – IBM EBCDIC US-Canada, 8-bit SBCS
500 – Latin 1
870 – IBM870
875 – cp875
1026 – EBCDIC Turkish
1047 – IBM01047 – Latin 1
1140 – IBM01140
1141 – IBM01141
1142 – IBM01142
1143 – IBM01143
1144 – IBM01144
1145 – IBM01145
1146 – IBM01146
1147 – IBM01147
1148 – IBM01148
1149 – IBM01149
20273 – EBCDIC Germany
20277 – EBCDIC Denmark/Norway
20278 – EBCDIC Finland/Sweden
20280 – EBCDIC Italy
20284 – EBCDIC Latin America/Spain
20285 – EBCDIC United Kingdom
20290 – EBCDIC Japanese
20297 – EBCDIC France
20420 – EBCDIC Arabic
20423 – EBCDIC Greek
20424 – EBCDIC Hebrew
20833 – x-EBCDIC-KoreanExtended – Korean
20838 – EBCDIC Thai
21025 – EBCDIC Cyrillic
20871 – EBCDIC Icelandic
20880 – EBCDIC Cyrillic
20905 – EBCDIC Turkish
21027 – Japanese EBCDIC
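
Unlike the other groups in this list, EBCDIC code pages are not ASCII-based, so even plain Latin letters have different byte values. The sketch below uses Python's `cp037` and `cp1140` codecs as stand-ins for the Windows code pages of the same numbers (an assumption); `to_hex` is a hypothetical helper.

```python
def to_hex(text, codec):
    """Encode text and show the resulting bytes as hexadecimal pairs."""
    return text.encode(codec).hex(" ")

if __name__ == "__main__":
    print("ASCII :", to_hex("Hello", "ascii"))
    print("CP037 :", to_hex("Hello", "cp037"))
    # Code page 1140 is the euro-enabled counterpart of code page 37.
    print("CP1140:", to_hex("\u20ac", "cp1140"))
```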

Unicode-related code pages

1200 – Unicode (UTF-16LE). Available only to managed applications
1201 – Unicode (UTF-16BE). Available only to managed applications
12000 – UTF-32. Available only to managed applications
12001 – UTF-32 big endian. Available only to managed applications
65000 – Unicode (UTF-7)
65001 – Unicode (UTF-8)
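
The Unicode-related code page numbers correspond to the familiar Unicode encoding forms: 1200 and 1201 to UTF-16 in little-endian and big-endian byte order, 12000 and 12001 to UTF-32, 65000 to UTF-7 and 65001 to UTF-8. A minimal sketch of that correspondence, using Python codec names, follows; the mapping table and the function name are illustrative, not part of any Windows API.

```python
# Windows code page number -> Python codec name (illustrative mapping).
CODE_PAGE_TO_CODEC = {
    1200: "utf-16-le",
    1201: "utf-16-be",
    12000: "utf-32-le",
    12001: "utf-32-be",
    65000: "utf-7",
    65001: "utf-8",
}

def encode_with_code_page(text, code_page):
    """Encode text with the Unicode encoding form that a code page number denotes."""
    return text.encode(CODE_PAGE_TO_CODEC[code_page])

if __name__ == "__main__":
    for code_page in CODE_PAGE_TO_CODEC:
        print(code_page, encode_with_code_page("\u00e9", code_page))
```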

Macintosh compatibility code pages

10000 – Apple Macintosh Roman
10001 – Apple Macintosh Japanese
10002 – Apple Macintosh Chinese (traditional)
10003 – Apple Macintosh Korean
10004 – Apple Macintosh Arabic
10005 – Apple Macintosh Hebrew
10006 – Apple Macintosh Greek
10007 – Apple Macintosh Cyrillic
10008 – Apple Macintosh Chinese (simplified)
10010 – Apple Macintosh Romanian
10017 – Apple Macintosh Ukrainian
10021 – Apple Macintosh Thai
10029 – Apple Macintosh Roman II / Central Europe
10079 – Apple Macintosh Icelandic
10081 – Apple Macintosh Turkish
10082 – Apple Macintosh Croatian
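
These code pages exist so that text written on classic Mac OS can be converted correctly. As a rough illustration, the sketch below decodes the same bytes as Mac Roman and as Windows-1252; it assumes Python's `mac_roman` codec is an acceptable stand-in for code page 10000, and `read_mac_text` is a hypothetical helper.

```python
def read_mac_text(raw):
    """Decode bytes that were written with the Macintosh Roman character set."""
    return raw.decode("mac_roman")

if __name__ == "__main__":
    raw = "caf\u00e9".encode("mac_roman")
    print("bytes:         ", raw)
    print("as Mac Roman:  ", read_mac_text(raw))
    print("as Windows-1252:", raw.decode("cp1252"))  # wrong letter for the last byte
```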

ISO 8859 code pages

28591 – ISO-8859-1 – Latin-1
28592 – ISO-8859-2 – Latin-2
28593 – ISO-8859-3 – Latin-3 or South European
28594 – ISO-8859-4 – Latin-4 or North European
28595 – ISO-8859-5 – Latin/Cyrillic
28596 – ISO-8859-6 – Latin/Arabic
28597 – ISO-8859-7 – Latin/Greek
28598 – ISO-8859-8 – Latin/Hebrew
28599 – ISO-8859-9 – Latin-5 or Turkish
28600 – ISO-8859-10 – Latin-6
28601 – ISO-8859-11 – Latin/Thai
28602 – ISO-8859-12 – reserved for Latin/Devanagari but abandoned
28603 – ISO-8859-13 – Latin-7 or Baltic Rim
28604 – ISO-8859-14 – Latin-8 or Celtic
28605 – ISO-8859-15 – Latin-9
28606 – ISO-8859-16 – Latin-10 or South-Eastern European
38596 – ISO-8859-6-I – Latin/Arabic
38598 – ISO-8859-8-I – Latin/Hebrew
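
The ISO 8859 code pages are distinct from the similarly named Windows-125x code pages, which matters most for bytes 0x80–0x9F and for the euro sign. The sketch below contrasts ISO-8859-1, ISO-8859-15 and Windows-1252 using Python's codecs; `euro_byte` is a hypothetical helper.

```python
def euro_byte(codec):
    """Return the byte used for the euro sign, or None if the codec lacks it."""
    try:
        return "\u20ac".encode(codec)
    except UnicodeEncodeError:
        return None

if __name__ == "__main__":
    for codec in ("iso8859_1", "iso8859_15", "cp1252"):
        print(codec, euro_byte(codec))
    # 0x80 is a C1 control character in ISO-8859-1 but the euro sign in Windows-1252.
    print(repr(b"\x80".decode("iso8859_1")), repr(b"\x80".decode("cp1252")))
```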

ITU-T code pages

20105 – 7-bit IA5 IRV
20106 – 7-bit IA5 German
20107 – 7-bit IA5 Swedish
20108 – 7-bit IA5 Norwegian
20127 – 7-bit US-ASCII
20261 – T.61
20269 – ISO-6937

KOI8 code pages

20866 – Russian - KOI8-R
21866 – Ukrainian - KOI8-U

Other code pages

20924 – IBM00924
20936 – x-cp20936
20949 – x-cp20949

Problems arising from the use of code pages

Microsoft strongly recommends using Unicode in modern applications, but many applications or data files still depend on the legacy code pages.

Programs need to know what code page to use in order to display the contents of files correctly. If a program uses the wrong code page it may show text as mojibake.
The code page in use may differ between machines, so files created on one machine may be unreadable on another.
Data is often improperly tagged with the code page, or not tagged at all, making determination of the correct code page to read the data difficult.
These Microsoft code pages differ to various degrees from some of the standards and other vendors' implementations. This isn't a Microsoft issue per se, as it happens to all vendors, but the lack of consistency makes interoperability with other systems unreliable in some cases.
The use of code pages limits the set of characters that may be used.
Characters expressed in an unsupported code page may be converted to question marks or other replacement characters, or to a simpler version. In either case, the original character may be lost.
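
Two of the problems above, mojibake and lossy replacement, are easy to reproduce. The following is a minimal sketch using Python's codecs; the helper names `mojibake` and `lossy_encode` are hypothetical, and the sample strings are arbitrary.

```python
def mojibake(text, written_as, read_as):
    """Encode text with one code page and decode it with another."""
    return text.encode(written_as).decode(read_as, errors="replace")

def lossy_encode(text, codec):
    """Encode text, substituting '?' for characters the code page lacks."""
    return text.encode(codec, errors="replace")

if __name__ == "__main__":
    russian = "\u041f\u0440\u0438\u0432\u0435\u0442"  # "Privet" in Cyrillic
    print(mojibake(russian, "cp1251", "cp1251"))  # round-trips correctly
    print(mojibake(russian, "cp1251", "cp866"))   # wrong code page: garbled
    print(mojibake("caf\u00e9", "utf-8", "cp1252"))  # UTF-8 read as Windows-1252
    print(lossy_encode(russian, "cp1252"))        # Cyrillic is not in Windows-1252
```

In the last case the original characters cannot be recovered from the output, which is why the conversion is described as lossy.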