ASCII character codes. ASCII (American Standard Code for Information Interchange) is the basic text encoding for the Latin alphabet

Each computer has its own set of characters that it implements. This set contains the 26 uppercase and 26 lowercase letters, the digits, and special characters (period, space, etc.). The integers that these characters are mapped to are called codes. Standards were developed so that different computers would use the same sets of codes.

ASCII standard

ASCII (American Standard Code for Information Interchange) is the American standard code for information exchange. Each ASCII character has 7 bits, so the maximum number of characters is 128 (Table 1). Codes 0 to 1F (hexadecimal) are control characters and are not printed. Many of these non-printable characters were intended for data transmission. For example, a message may consist of the start-of-heading character SOH, the header itself, the start-of-text character STX, the text itself, the end-of-text character ETX, and finally the end-of-transmission character EOT. Today, however, data is sent over networks in packets, and the packet formats themselves mark the beginning and end of a transmission, so these transmission control characters are almost never used.
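The framing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `frame_message` and `parse_message` are hypothetical, and real protocols that used these characters added further rules (checksums, escaping) that are not shown here.

```python
# ASCII transmission control characters (values from Table 1).
SOH = b"\x01"  # start of heading
STX = b"\x02"  # start of text
ETX = b"\x03"  # end of text
EOT = b"\x04"  # end of transmission


def frame_message(header: bytes, text: bytes) -> bytes:
    """Wrap a header and a text body in SOH/STX/ETX/EOT markers."""
    return SOH + header + STX + text + ETX + EOT


def parse_message(frame: bytes):
    """Split a framed message back into (header, text)."""
    if not (frame.startswith(SOH) and frame.endswith(ETX + EOT)):
        raise ValueError("not a framed message")
    # Drop the leading SOH and the trailing ETX + EOT, then split on STX.
    header, separator, text = frame[1:-2].partition(STX)
    if not separator:
        raise ValueError("missing STX")
    return header, text


frame = frame_message(b"to:printer", b"hello")
print(frame)                 # the raw bytes, control characters included
print(parse_message(frame))  # the header and text recovered from the frame
```

The sketch assumes that the header and text never contain the control bytes themselves.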

Table 1 - ASCII code table

Hex	Name	Meaning	Hex	Name	Meaning

0	NUL	Null	10	DLE	Data link escape
1	SOH	Start of heading	11	DC1	Device control 1
2	STX	Start of text	12	DC2	Device control 2
3	ETX	End of text	13	DC3	Device control 3
4	EOT	End of transmission	14	DC4	Device control 4
5	ENQ	Enquiry	15	NAK	Negative acknowledgement
6	ACK	Acknowledgement	16	SYN	Synchronous idle
7	BEL	Bell	17	ETB	End of transmission block
8	BS	Backspace	18	CAN	Cancel
9	HT	Horizontal tab	19	EM	End of medium
A	LF	Line feed	1A	SUB	Substitute
B	VT	Vertical tab	1B	ESC	Escape
C	FF	Form feed	1C	FS	File separator
D	CR	Carriage return	1D	GS	Group separator
E	SO	Shift out	1E	RS	Record separator
F	SI	Shift in	1F	US	Unit separator

Hex	Symbol	Hex	Symbol	Hex	Symbol	Hex	Symbol	Hex	Symbol	Hex	Symbol

20	space	30	0	40	@	50	P	60	`	70	p
21	!	31	1	41	A	51	Q	61	a	71	q
22	"	32	2	42	B	52	R	62	b	72	r
23	#	33	3	43	C	53	S	63	c	73	s
24	$	34	4	44	D	54	T	64	d	74	t
25	%	35	5	45	E	55	U	65	e	75	u
26	&	36	6	46	F	56	V	66	f	76	v
27	'	37	7	47	G	57	W	67	g	77	w
28	(	38	8	48	H	58	X	68	h	78	x
29	)	39	9	49	I	59	Y	69	i	79	y
2A	*	3A	:	4A	J	5A	Z	6A	j	7A	z
2B	+	3B	;	4B	K	5B	[	6B	k	7B	{
2C	,	3C	<	4C	L	5C	\	6C	l	7C	|
2D	-	3D	=	4D	M	5D	]	6D	m	7D	}
2E	.	3E	>	4E	N	5E	^	6E	n	7E	~
2F	/	3F	?	4F	O	5F	_	6F	o	7F	DEL

Unicode standard

ASCII works well for English but is inconvenient for other languages. For example, German has umlauts, French has accented letters, and some languages use completely different alphabets. The first attempt at adapting ASCII was ISO 646, which let national variants replace a few of the 7-bit codes. Later, 8-bit extensions added another 128 characters; the set with Latin letters carrying strokes and diacritics received the name Latin-1. The ISO 8859 standard built on this with code pages: sets of 256 characters for a particular language or group of languages (Latin-1 is ISO 8859-1). There were further attempts at extension, but none was universal.

The Unicode encoding (ISO 10646) was created in response. The original idea was to assign each character a single fixed 16-bit value, called a code point, for a total of 65,536 code points. To ease the transition, Unicode uses Latin-1 for code points 0-255, which makes converting ASCII to Unicode simple. This design solved many problems, but not all. New characters keep arriving: Japanese alone, for example, needed about 20 thousand more, and Braille also had to be included, so the number of code points had to grow.

8-bit encodings: ASCII, KOI-8R and CP1251

The first encoding tables created in the USA did not use the eighth bit of a byte. Text was represented as a sequence of bytes, but the eighth bit was not part of the character code (it was used for service purposes).

The ASCII (American Standard Code for Information Interchange) table has become a generally accepted standard. The first 32 characters of the ASCII table (00 to 1F) were used for non-printing characters. They were designed to control a printing device, etc. The rest - from 20 to 7F - are regular (printable) characters.

Table 1 - ASCII encoding

Dec Hex Oct Char Description

0	0	000		null
1	1	001		start of heading
2	2	002		start of text
3	3	003		end of text
4	4	004		end of transmission
5	5	005		inquiry
6	6	006		acknowledge
7	7	007		bell
8	8	010		backspace
9	9	011		horizontal tab
10	A	012		new line
11	B	013		vertical tab
12	C	014		new page
13	D	015		carriage return
14	E	016		shift out
15	F	017		shift in
16	10	020		data link escape
17	11	021		device control 1
18	12	022		device control 2
19	13	023		device control 3
20	14	024		device control 4
21	15	025		negative acknowledge
22	16	026		synchronous idle
23	17	027		end of trans. block
24	18	030		cancel
25	19	031		end of medium
26	1A	032		substitute
27	1B	033		escape
28	1C	034		file separator
29	1D	035		group separator
30	1E	036		record separator
31	1F	037		unit separator
32	20	040		space
33	21	041	!
34	22	042	"
35	23	043	#
36	24	044	$
37	25	045	%
38	26	046	&
39	27	047	'
40	28	050	(
41	29	051	)
42	2A	052	*
43	2B	053	+
44	2C	054	,
45	2D	055	-
46	2E	056	.
47	2F	057	/
48	30	060	0
49	31	061	1
50	32	062	2
51	33	063	3
52	34	064	4
53	35	065	5
54	36	066	6
55	37	067	7
56	38	070	8
57	39	071	9
58	3A	072	:
59	3B	073	;
60	3C	074	<
61	3D	075	=
62	3E	076	>
63	3F	077	?

Dec Hex Oct Char

64	40	100	@
65	41	101	A
66	42	102	B
67	43	103	C
68	44	104	D
69	45	105	E
70	46	106	F
71	47	107	G
72	48	110	H
73	49	111	I
74	4A	112	J
75	4B	113	K
76	4C	114	L
77	4D	115	M
78	4E	116	N
79	4F	117	O
80	50	120	P
81	51	121	Q
82	52	122	R
83	53	123	S
84	54	124	T
85	55	125	U
86	56	126	V
87	57	127	W
88	58	130	X
89	59	131	Y
90	5A	132	Z
91	5B	133	[
92	5C	134	\
93	5D	135	]
94	5E	136	^
95	5F	137	_
96	60	140	`
97	61	141	a
98	62	142	b
99	63	143	c
100	64	144	d
101	65	145	e
102	66	146	f
103	67	147	g
104	68	150	h
105	69	151	i
106	6A	152	j
107	6B	153	k
108	6C	154	l
109	6D	155	m
110	6E	156	n
111	6F	157	o
112	70	160	p
113	71	161	q
114	72	162	r
115	73	163	s
116	74	164	t
117	75	165	u
118	76	166	v
119	77	167	w
120	78	170	x
121	79	171	y
122	7A	172	z
123	7B	173	{
124	7C	174	|
125	7D	175	}
126	7E	176	~
127	7F	177	DEL

As you can easily see, this encoding contains only Latin letters, and those that are used in the English language. There are also arithmetic and other service symbols. But there are neither Russian letters, nor even special Latin ones for German or French. This is easy to explain - the encoding was developed specifically as an American standard. As computers began to be used throughout the world, other characters needed to be encoded.

To do this, it was decided to use the eighth bit of each byte. This made 128 more values available (from 80 to FF) for encoding characters. The first of the eight-bit tables, “Extended ASCII”, included variants of Latin characters used in some languages of Western Europe. It also contained other additional symbols, including pseudographics.

Pseudographic characters allow you to provide some semblance of graphics by displaying only text characters on the screen. For example, the file management program FAR Manager works using pseudographics.

There were no Russian letters in the Extended ASCII table. Russia (formerly the USSR) and other countries created their own encodings that made it possible to represent specific “national” characters in 8-bit text files - Latin letters of the Polish and Czech languages, Cyrillic (including Russian letters) and other alphabets.

In all encodings that have become widespread, the first 128 characters (that is, the byte values with the eighth bit equal to 0) are the same as in ASCII. So an ASCII file reads correctly in any of these encodings; the letters of the English language are represented in the same way.
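A quick way to check this compatibility is to decode the same ASCII bytes with several 8-bit tables. The sketch below uses Python's built-in codecs; the particular list of encodings is just an illustrative selection.

```python
text = "Hello, ASCII!"
raw = text.encode("ascii")  # every byte has the eighth bit equal to 0

# A few 8-bit encodings mentioned in this article, by their Python codec names.
ENCODINGS = ["ascii", "cp866", "koi8_r", "cp1251", "iso8859_5", "latin_1"]

# Decode the same bytes with each table.
decoded = {name: raw.decode(name) for name in ENCODINGS}

for name, result in decoded.items():
    print(f"{name:10} {result}")
```

Each table yields the same text because the tables differ only in the upper half (80 to FF).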

The International Organization for Standardization (ISO) has adopted the ISO 8859 group of standards, which defines 8-bit encodings for different language groups. For example, ISO 8859-1 is an extended ASCII table for the USA and Western Europe, and ISO 8859-5 is a table for the Cyrillic alphabet (including Russian).

However, for historical reasons, the ISO 8859-5 encoding did not take root. In reality, the following encodings are used for the Russian language:

- Code Page 866 (CP866), also known as “DOS” or the “alternative GOST encoding”. Widely used until the mid-90s; now used to a limited extent. Practically never used for distributing texts on the Internet.
- KOI-8. Developed in the 70s and 80s. It is a generally accepted standard for transmitting email messages on the Russian Internet. It is also widely used in operating systems of the Unix family, including Linux. The Russian-language version of KOI-8 is called KOI-8R; there are versions for other Cyrillic languages (for example, KOI8-U is a version for the Ukrainian language).
- Code Page 1251, CP1251, Windows-1251. Developed by Microsoft to support the Russian language in Windows.

The main advantage of CP866 was that it kept the pseudographic characters in the same places as in Extended ASCII; therefore, foreign text-mode programs, such as the famous Norton Commander, could work without changes. CP866 is now used by Windows programs that run in text windows or in full-screen text mode, including FAR Manager.

Texts in CP866 have been quite rare in recent years (but it is used to encode Russian file names in Windows). Therefore, we will dwell in more detail on two other encodings - KOI-8R and CP1251.

In the CP1251 encoding table, Russian letters are arranged in alphabetical order (with the exception, however, of the letter Ё). This arrangement makes alphabetical sorting very easy for computer programs.

In KOI-8R, by contrast, the order of Russian letters seems random. In reality it is not.

In many older programs, the 8th bit was lost when text was processed or transmitted. (Such programs are now practically “extinct”, but in the late 80s and early 90s they were widespread.) Losing the 8th bit turns an 8-bit value into a 7-bit one, which amounts to subtracting 8 from the most significant hexadecimal digit; for example, E1 becomes 61.

Now compare KOI-8R with the ASCII table (Table 1). You will find that Russian letters are placed in clear correspondence with Latin ones. If the eighth bit disappears, lowercase Russian letters turn into uppercase Latin letters, and uppercase Russian letters turn into lowercase Latin letters. So, E1 in KOI-8 is the Russian “A”, while 61 in ASCII is the Latin “a”.

So KOI-8 keeps Russian text readable when the 8th bit is lost: “Привет всем” (“Hello everyone”) becomes “pRIWET WSEM”.
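The effect is easy to reproduce. The following sketch encodes the Russian phrase “Привет всем” (“Hello everyone”) with Python's built-in `koi8_r` codec and then clears the eighth bit of every byte. The function name `strip_eighth_bit` is made up for this illustration.

```python
def strip_eighth_bit(text: str) -> str:
    """Encode text in KOI8-R, clear bit 8 of every byte, read the result as ASCII."""
    raw = text.encode("koi8_r")
    seven_bit = bytes(b & 0x7F for b in raw)  # e.g. 0xE1 -> 0x61
    return seven_bit.decode("ascii")


print(strip_eighth_bit("Привет всем"))  # pRIWET WSEM
print(strip_eighth_bit("А"))            # a  (E1 in KOI8-R becomes 61 in ASCII)
```

Plain ASCII characters such as the space already have the eighth bit clear, so they pass through unchanged.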

Recently, both the alphabetical order of characters in the encoding table and readability after loss of the 8th bit have lost their decisive importance. The eighth bit is not lost during transmission or processing on modern computers, and alphabetical sorting takes the encoding into account instead of simply comparing codes. (Incidentally, the CP1251 codes are not arranged entirely alphabetically: the letter Ё is out of place.)

Because two encodings are in common use, when working with the Internet (mail, browsing Web sites) you can sometimes see a meaningless set of letters instead of Russian text, for example “Я СБЮФЕМХЕЛ”. These are just the words “с уважением” (“with respect”), but they were encoded in CP1251 and the computer decoded the text using the KOI-8 table. If the same words were instead encoded in KOI-8 and decoded using the CP1251 table, the result would be “У ХЧБЦЕОЙЕН”.
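This mix-up can be reproduced with Python's built-in `cp1251` and `koi8_r` codecs. The sketch below takes the Russian phrase “с уважением” (“with respect”), writes it in one encoding and reads it back in the other. The helper name `misdecode` is hypothetical.

```python
def misdecode(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one table and decode the same bytes with another."""
    return text.encode(written_as).decode(read_as)


phrase = "с уважением"

# Written in CP1251, read as KOI8-R.
print(misdecode(phrase, "cp1251", "koi8_r"))  # Я СБЮФЕМХЕЛ

# Written in KOI8-R, read as CP1251.
print(misdecode(phrase, "koi8_r", "cp1251"))  # У ХЧБЦЕОЙЕН

# Reversing the mistake recovers the original text.
garbled = misdecode(phrase, "cp1251", "koi8_r")
print(misdecode(garbled, "koi8_r", "cp1251"))  # с уважением
```

The last step works because no bytes were lost: the garbled text still carries the original byte values, only interpreted through the wrong table.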

Sometimes a computer decodes Russian-language messages using a table not intended for the Russian language at all. Then, instead of Russian letters, a meaningless set of symbols appears (for example, Latin letters of Eastern European languages); these are often called “krokozyabry”.

In most cases, modern programs cope with determining the encodings of Internet documents (emails and Web pages) independently. But sometimes they “misfire”, and then you can see strange sequences of Russian letters or “krokozyabry”. As a rule, in such a situation, to display real text on the screen, it is enough to select the encoding manually in the program menu.

Information from the page http://open-office.edusite.ru/TextProcessor/p5aa1.html was used for this article.


In this article: Insert an ASCII or Unicode character into a document

If you only need to enter a few special characters or symbols, you can use keyboard shortcuts. For a list of ASCII characters, see the following tables or the article Inserting National Alphabets Using Keyboard Shortcuts.

Inserting ASCII characters

To insert an ASCII character, press and hold the ALT key while entering the character code. For example, to insert the degree symbol (°), press and hold the ALT key, then enter 0176 on the numeric keypad.

To enter numbers, use the numeric keypad rather than the numbers on the main keyboard. If you need to enter numbers on the numeric keypad, make sure the NUM LOCK indicator is on.

Inserting Unicode Characters

To insert a Unicode character, enter the character code, then press ALT and X. For example, to insert a dollar symbol ($), enter 0024 and press ALT and X.

Important: Some Microsoft Office programs, such as PowerPoint and InfoPath, do not support converting Unicode codes to characters. If you need to insert a Unicode character in one of these programs, use the Character Map.

Notes:

If the wrong Unicode character appears after you press ALT+X, select the correct code, and then press ALT+X again.

In addition, when the code directly follows other characters that could be read as part of it, you must enter "U+" before the code. For example, if you enter "1U+B5" and press ALT+X, the text "1µ" will be displayed, whereas if you enter "1B5" and press ALT+X, the symbol "Ƶ" will be displayed.
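The relationship between a hexadecimal code and its character can be checked outside of Office as well. The sketch below is an illustration only, not a description of how ALT+X is implemented; the function name `char_from_code` is hypothetical.

```python
def char_from_code(code: str) -> str:
    """Return the character for a hexadecimal Unicode code such as '0024' or 'U+B5'."""
    if code.upper().startswith("U+"):
        code = code[2:]          # drop the optional "U+" prefix
    return chr(int(code, 16))    # interpret the rest as a hexadecimal number


print(char_from_code("0024"))  # dollar sign
print(char_from_code("U+B5"))  # micro sign
print(char_from_code("1B5"))   # Latin capital letter Z with stroke
```

The last two calls show why the prefix matters in the example above: B5 and 1B5 are different code points.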

Using the Character Map

The Character Map is a program built into Microsoft Windows that lets you view the characters available in a selected font.

Using the Character Map, you can copy individual characters or a group of characters to the clipboard and paste them into any program that can display them.

Opening the Character Map

In Windows 10, type "character" in the search box on the taskbar and select Character Map from the search results.

In Windows 8, type "character" on the Start screen and select Character Map from the search results.

In Windows 7, click the Start button, select All Programs, Accessories, System Tools, and then click Character Map.

Characters are grouped by font. Click the font list to select the appropriate character set. To select a character, click it, then click the Select button. To insert the character, right-click the desired location in the document and select Paste.

Non-printing ASCII control characters

Codes 0–31 of the ASCII table are assigned to characters used to control some peripheral devices, such as printers. For example, number 12 is the form feed/new page character, which tells the printer to move to the beginning of the next page.

Table of non-printing ASCII control characters

Decimal number	Character	Decimal number	Character
0	Null	16	Data link escape
1	Start of heading	17	Device control 1
2	Start of text	18	Device control 2
3	End of text	19	Device control 3
4	End of transmission	20	Device control 4
5	Enquiry	21	Negative acknowledgement
6	Acknowledgement	22	Synchronous idle
7	Bell	23	End of transmission block
8	Backspace	24	Cancel
9	Horizontal tab	25	End of medium
10	Line feed/new line	26	Substitute
11	Vertical tab	27	Escape
12	Form feed/new page	28	File separator
13	Carriage return	29	Group separator
14	Shift out	30	Record separator
15	Shift in	31	Unit separator

Dec	Hex	Symbol	Dec	Hex	Symbol
000	00	NUL	128	80	Ђ
001	01	SOH	129	81	Ѓ
002	02	STX	130	82	‚
003	03	ETX	131	83	ѓ
004	04	EOT	132	84	„
005	05	ENQ	133	85	…
006	06	ACK	134	86	†
007	07	BEL	135	87	‡
008	08	BS	136	88	€
009	09	TAB	137	89	‰
010	0A	LF	138	8A	Љ
011	0B	VT	139	8B	‹
012	0C	FF	140	8C	Њ
013	0D	CR	141	8D	Ќ
014	0E	SO	142	8E	Ћ
015	0F	SI	143	8F	Џ
016	10	DLE	144	90	ђ
017	11	DC1	145	91	‘
018	12	DC2	146	92	’
019	13	DC3	147	93	“
020	14	DC4	148	94	”
021	15	NAK	149	95	•
022	16	SYN	150	96	–
023	17	ETB	151	97	—
024	18	CAN	152	98	(not used)
025	19	EM	153	99	™
026	1A	SUB	154	9A	љ
027	1B	ESC	155	9B	›
028	1C	FS	156	9C	њ
029	1D	GS	157	9D	ќ
030	1E	RS	158	9E	ћ
031	1F	US	159	9F	џ
032	20	SP (space)	160	A0	NBSP (no-break space)
033	21	!	161	A1	Ў
034	22	"	162	A2	ў
035	23	#	163	A3	Ј
036	24	$	164	A4	¤
037	25	%	165	A5	Ґ
038	26	&	166	A6	¦
039	27	'	167	A7	§
040	28	(	168	A8	Ё
041	29	)	169	A9	©
042	2A	*	170	AA	Є
043	2B	+	171	AB	«
044	2C	,	172	AC	¬
045	2D	-	173	AD	SHY (soft hyphen)
046	2E	.	174	AE	®
047	2F	/	175	AF	Ї
048	30	0	176	B0	°
049	31	1	177	B1	±
050	32	2	178	B2	І
051	33	3	179	B3	і
052	34	4	180	B4	ґ
053	35	5	181	B5	µ
054	36	6	182	B6	¶
055	37	7	183	B7	·
056	38	8	184	B8	ё
057	39	9	185	B9	№
058	3A	:	186	BA	є
059	3B	;	187	BB	»
060	3C	<	188	BC	ј
061	3D	=	189	BD	Ѕ
062	3E	>	190	BE	ѕ
063	3F	?	191	BF	ї
064	40	@	192	C0	А
065	41	A	193	C1	Б
066	42	B	194	C2	В
067	43	C	195	C3	Г
068	44	D	196	C4	Д
069	45	E	197	C5	Е
070	46	F	198	C6	Ж
071	47	G	199	C7	З
072	48	H	200	C8	И
073	49	I	201	C9	Й
074	4A	J	202	CA	К
075	4B	K	203	CB	Л
076	4C	L	204	CC	М
077	4D	M	205	CD	Н
078	4E	N	206	CE	О
079	4F	O	207	CF	П
080	50	P	208	D0	Р
081	51	Q	209	D1	С
082	52	R	210	D2	Т
083	53	S	211	D3	У
084	54	T	212	D4	Ф
085	55	U	213	D5	Х
086	56	V	214	D6	Ц
087	57	W	215	D7	Ч
088	58	X	216	D8	Ш
089	59	Y	217	D9	Щ
090	5A	Z	218	DA	Ъ
091	5B	[	219	DB	Ы
092	5C	\	220	DC	Ь
093	5D	]	221	DD	Э
094	5E	^	222	DE	Ю
095	5F	_	223	DF	Я
096	60	`	224	E0	а
097	61	a	225	E1	б
098	62	b	226	E2	в
099	63	c	227	E3	г
100	64	d	228	E4	д
101	65	e	229	E5	е
102	66	f	230	E6	ж
103	67	g	231	E7	з
104	68	h	232	E8	и
105	69	i	233	E9	й
106	6A	j	234	EA	к
107	6B	k	235	EB	л
108	6C	l	236	EC	м
109	6D	m	237	ED	н
110	6E	n	238	EE	о
111	6F	o	239	EF	п
112	70	p	240	F0	р
113	71	q	241	F1	с
114	72	r	242	F2	т
115	73	s	243	F3	у
116	74	t	244	F4	ф
117	75	u	245	F5	х
118	76	v	246	F6	ц
119	77	w	247	F7	ч
120	78	x	248	F8	ш
121	79	y	249	F9	щ
122	7A	z	250	FA	ъ
123	7B	{	251	FB	ы
124	7C	|	252	FC	ь
125	7D	}	253	FD	э
126	7E	~	254	FE	ю
127	7F	DEL	255	FF	я

ASCII and Windows (CP1251) character code table.

Description of special (control) characters

It should be noted that the control characters of the ASCII table were originally used for data exchange via teletype, for data entry from punched tape, and for simple control of external devices.
Currently, most of the ASCII control characters no longer serve this purpose and can be used for other purposes.

Code	Description

NUL, 00	Null, empty
SOH, 01	Start Of Heading
STX, 02	Start of TeXt, the beginning of the text.
ETX, 03	End of TeXt, end of text
EOT, 04	End of Transmission, end of transmission
ENQ, 05	Enquiry, a request for confirmation
ACK, 06	Acknowledgment. I confirm
BEL, 07	Bell, call
BS, 08	Backspace, go back one character
TAB, 09	Tab, horizontal tab
LF, 0A	Line Feed, line feed. Nowadays in most programming languages it is denoted as \n
VT, 0B	Vertical Tab, vertical tabulation.
FF, 0C	Form Feed, page feed, new page
CR, 0D	Carriage Return, carriage return. Nowadays in most programming languages it is denoted as \r
SO, 0E	Shift Out, change the color of the ink ribbon in the printing device
SI, 0F	Shift In, return the color of the ink ribbon in the printing device back
DLE, 10	Data Link Escape, switching the channel to data transmission
DC1, 11 DC2, 12 DC3, 13 DC4, 14	Device Control, device control symbols
NAK, 15	Negative Acknowledgment, I do not confirm.
SYN, 16	Synchronization. Synchronization symbol
ETB, 17	End of Transmission Block, end of the transmitted block
CAN, 18	Cancel, canceling previously transferred
EM, 19	End of Medium
SUB, 1A	Substitute, substitute. Placed in place of a symbol whose meaning was lost or corrupted during transmission
ESC, 1B	Escape Control Sequence
FS, 1C	File Separator, file separator
GS, 1D	Group Separator
RS, 1E	Record Separator, record separator
US, 1F	Unit Separator
DEL, 7F	Delete, erase the last character.
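Several of these control characters still have escape sequences in most programming languages. As a small check, the sketch below maps Python's escape sequences to the names and hexadecimal codes from the list above. The dictionary and function names are made up for this illustration.

```python
# Python escape sequences for ASCII control characters and their names.
CONTROL_ESCAPES = {
    "\a": "BEL",
    "\b": "BS",
    "\t": "TAB",
    "\n": "LF",
    "\v": "VT",
    "\f": "FF",
    "\r": "CR",
}


def describe(ch: str) -> str:
    """Format a control character as 'NAME, HEX', in the style of the list above."""
    return f"{CONTROL_ESCAPES[ch]}, {ord(ch):02X}"


for ch in CONTROL_ESCAPES:
    print(describe(ch))
```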

The set of characters with which text is written is called an alphabet.

The number of characters in an alphabet is its power (size).

The formula relating the power of an alphabet to the amount of information per character is N = 2^b,

where N is the power of the alphabet (number of characters),

b is the number of bits (the information weight of one character).

An alphabet with a power of 256 characters can accommodate almost all the necessary characters. Such an alphabet is called sufficient.

Because 256 = 2^8, the weight of one character is 8 bits.

The unit of 8 bits was given the name byte:

1 byte = 8 bits.

With such an alphabet, the binary code of each character in computer text takes up 1 byte of memory.
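The relationship between the size of an alphabet and the number of bits per character can be computed directly. The sketch below finds the smallest b for which 2 to the power b is at least N; the function name `bits_per_symbol` is hypothetical.

```python
def bits_per_symbol(alphabet_power: int) -> int:
    """Smallest number of bits b such that 2**b >= alphabet_power."""
    if alphabet_power < 1:
        raise ValueError("an alphabet needs at least one character")
    # (N - 1).bit_length() is the number of bits needed to count from 0 to N - 1.
    return (alphabet_power - 1).bit_length()


for n in (128, 256, 65536):
    print(n, "characters ->", bits_per_symbol(n), "bits")
```

For alphabet sizes that are not exact powers of two, the result is rounded up: 33 characters, for example, need 6 bits.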

How is text information represented in computer memory?

The convenience of byte-by-byte character encoding is obvious because a byte is the smallest addressable part of memory and, therefore, the processor can access each character separately when processing text. On the other hand, 256 characters is quite a sufficient number to represent a wide variety of symbolic information.

Now the question arises of which eight-bit binary code to assign to each character.

Clearly, this is a matter of convention; many encoding schemes are possible.

All characters of the computer alphabet are numbered from 0 to 255. Each number corresponds to an eight-bit binary code from 00000000 to 11111111. This code is simply the serial number of the character in the binary number system.
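The correspondence between a character, its serial number and its eight-bit binary code can be illustrated in both directions. This is a minimal sketch using Python's built-in codecs; the function names are made up, and the encoding parameter defaults to ASCII as an assumption.

```python
def to_binary_code(ch: str, encoding: str = "ascii") -> str:
    """Return the eight-bit binary code of a single character."""
    (number,) = ch.encode(encoding)  # fails unless the character is exactly one byte
    return format(number, "08b")     # serial number written in binary, padded to 8 bits


def from_binary_code(bits: str, encoding: str = "ascii") -> str:
    """Return the character whose serial number is given in binary."""
    return bytes([int(bits, 2)]).decode(encoding)


print(to_binary_code("A"))             # 01000001 (serial number 65)
print(from_binary_code("01111010"))    # z (serial number 122)
print(to_binary_code("Я", "cp1251"))   # 11011111 (serial number 223 in CP1251)
```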

A table in which all characters of the computer alphabet are assigned serial numbers is called an encoding table.

Different types of computers use different encoding tables.

The ASCII table (pronounced “ask-ee”; American Standard Code for Information Interchange) has become the international standard for PCs.

The ASCII code table is divided into two parts.

Only the first half of the table is the international standard, i.e. the characters with numbers from 0 (00000000) to 127 (01111111).

ASCII encoding table structure

Serial number	Code	Symbol
0 - 31	00000000 - 00011111	Characters with numbers from 0 to 31 are usually called control characters. Their function is to control the process of displaying text on the screen or printing it, sounding a beep, marking up text, etc.
32 - 127	00100000 - 01111111	Standard part of the table (English). This includes lowercase and uppercase letters of the Latin alphabet, decimal digits, punctuation marks, all kinds of brackets, commercial and other symbols. Character 32 is a space, i.e. an empty position in the text. All the others are displayed as specific signs.
128 - 255	10000000 - 11111111	Alternative part of the table (Russian). The second half of the ASCII code table, called the code page (128 codes, starting from 10000000 and ending with 11111111), can have different options, each option has its own number. The code page is primarily used to accommodate national alphabets other than Latin. In Russian national encodings, characters from the Russian alphabet are placed in this part of the table.

First half of the ASCII code table

Please note that in the encoding table, letters (uppercase and lowercase) are arranged in alphabetical order, and numbers are ordered in ascending order. This observance of lexicographic order in the arrangement of symbols is called the principle of sequential coding of the alphabet.

For letters of the Russian alphabet, the principle of sequential coding is also observed.
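The principle of sequential coding can be verified by checking that neighbouring characters have codes that differ by exactly one. In the sketch below the function name `is_sequential` is hypothetical, and CP1251 is assumed as the Russian encoding, since the text does not name one at this point.

```python
def is_sequential(chars: str, encoding: str) -> bool:
    """True if each character's code is exactly one more than the previous one."""
    codes = [ch.encode(encoding)[0] for ch in chars]
    return all(b - a == 1 for a, b in zip(codes, codes[1:]))


LATIN_UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS = "0123456789"
RUSSIAN_UPPER = "АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ"  # without the letter Ё

print(is_sequential(LATIN_UPPER, "ascii"))     # True
print(is_sequential(DIGITS, "ascii"))          # True
print(is_sequential(RUSSIAN_UPPER, "cp1251"))  # True
print(is_sequential("ДЕЁЖ", "cp1251"))         # False: Ё is stored out of order
```

Sequential coding is what makes tricks such as `ord(letter) - ord("A")` work for finding a letter's position in the alphabet.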

Second half of the ASCII code table

Unfortunately, there are currently five different Cyrillic encodings (KOI8-R, Windows, MS-DOS, Macintosh and ISO). Because of this, problems often arise when transferring Russian text from one computer to another or from one software system to another.

Chronologically, one of the first standards for encoding Russian letters on computers was KOI8 ("Information Exchange Code, 8-bit"). This encoding was used back in the 70s on computers of the ES computer series, and from the mid-80s it began to be used in the first Russified versions of the UNIX operating system.

The CP866 encoding ("CP" stands for "Code Page") remains from the early 90s, when the MS DOS operating system was dominant.

Apple computers running the Mac OS operating system use their own Mac encoding.

In addition, the International Standards Organization (ISO) has approved another encoding called ISO 8859-5 as a standard for the Russian language.

The most common encoding in current use is the Microsoft Windows encoding, abbreviated CP1251.

Since the late 90s, the problem of standardizing character encoding has been addressed by a new international standard called Unicode. It was originally designed as a 16-bit encoding, i.e. it allocates 2 bytes of memory for each character. Compared with an 8-bit encoding, this doubles the amount of memory occupied, but such a code table allows up to 65,536 characters. The standard has since grown beyond 16 bits, and its complete specification includes the existing, extinct and artificially created alphabets of the world, as well as many mathematical, musical, chemical and other symbols.
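The doubling of memory can be seen by encoding the same word in an 8-bit table and in a 16-bit form. The sketch below uses UTF-16 (little-endian, without a byte order mark) as the 16-bit form; that choice is an assumption for illustration, since the text above does not name a specific Unicode encoding form.

```python
word = "Привет"  # six Cyrillic letters

one_byte_per_char = word.encode("cp1251")      # 8-bit code page
two_bytes_per_char = word.encode("utf-16-le")  # 16-bit code units, no byte order mark

print(len(word), "characters")
print(len(one_byte_per_char), "bytes in CP1251")
print(len(two_bytes_per_char), "bytes in UTF-16")
```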

Let's use the ASCII table to see what words look like in computer memory.

Internal representation of words in computer memory
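As a rough illustration of this internal representation, the sketch below prints the bytes of a word as eight-bit binary codes. The function name `memory_dump` and the sample words are chosen for this example only.

```python
def memory_dump(word: str, encoding: str = "ascii") -> str:
    """Show the bytes that store a word, one eight-bit binary code per character."""
    return " ".join(format(byte, "08b") for byte in word.encode(encoding))


print(memory_dump("file"))
# 01100110 01101001 01101100 01100101

# The same Russian word is stored as different bytes in different encodings.
print(memory_dump("мир", "cp1251"))
print(memory_dump("мир", "koi8_r"))
```

The last two lines differ, which is exactly why text written on one computer can be unreadable on another that assumes a different table.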

Sometimes it happens that a text consisting of letters of the Russian alphabet received from another computer cannot be read - some kind of “abracadabra” is visible on the monitor screen. This happens because computers use different character encodings for the Russian language.