Word: utf-8
Definition (see also ASCII)

UTF-8


(character) (UCS transformation format 8) An ASCII-compatible multibyte Unicode and UCS encoding, used by Java and Plan 9.

The Unicode character set originally occupied a 16-bit code space. The most obvious Unicode encoding (known as UCS-2) consists of a sequence of 16-bit words. Such strings can contain bytes like '\0' or '/', which have a special meaning in filenames and other C library function parameters. In addition, the majority of Unix tools expect ASCII files and can't read 16-bit words as characters without major modifications. For these reasons, UCS-2 is not a suitable external encoding of Unicode in filenames, text files, environment variables, etc.
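A minimal Python sketch of this problem, assuming Python's `utf-16-be` codec as a stand-in for UCS-2 (the two agree for characters inside the 16-bit code space). The helper name `problem_bytes` and the sample characters are illustrative choices, not part of the original entry:

```python
def problem_bytes(text, encoding):
    """List which of NUL and '/' occur as raw bytes in the encoded text."""
    encoded = text.encode(encoding)
    found = []
    if b"\x00" in encoded:
        found.append("NUL")
    if b"/" in encoded:
        found.append("/")
    return found

# Neither sample contains a NUL or a slash as a character, yet both
# bytes show up once the text is stored as 16-bit words.
print(problem_bytes("A", "utf-16-be"))       # U+0041 is stored as bytes 00 41
print(problem_bytes("\u2f00", "utf-16-be"))  # U+2F00 is stored as bytes 2F 00
print(problem_bytes("A\u2f00", "utf-8"))     # no such bytes in UTF-8
```

A C function that treats a 0x00 byte as the end of a string, or a filesystem that treats 0x2F as a path separator, would misread either 16-bit string.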

The ISO 10646 Universal Character Set (UCS), a superset of Unicode, was originally defined with a 31-bit code space, and the obvious UCS-4 encoding for it (a sequence of 32-bit words) has the same problems.

The UTF-8 encoding of Unicode and UCS avoids the problems of fixed-length Unicode encodings because an ASCII file encoded in UTF-8 is exactly the same as the original ASCII file, and every byte of a non-ASCII character is guaranteed to have the most significant bit (0x80) set. This means that normal tools for text searching etc. work as expected.
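Both properties can be checked with a short Python sketch. The sample strings and the helper names `non_ascii_bytes_have_high_bit` and `slash_count_matches` are assumptions made for illustration:

```python
ascii_text = "grep -r pattern /usr/include"
# "naive cafe" with accents, three CJK characters, then an ASCII path.
mixed_text = "na\u00efve caf\u00e9 \u65e5\u672c\u8a9e/path"

# Property 1: pure ASCII text encodes to the identical byte sequence.
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

def non_ascii_bytes_have_high_bit(text):
    """Check that every byte of every non-ASCII character has bit 0x80 set."""
    for ch in text:
        if ord(ch) > 0x7F:
            if not all(b & 0x80 for b in ch.encode("utf-8")):
                return False
    return True

def slash_count_matches(text):
    """Check that a byte-level search for '/' finds only real slash characters."""
    return text.encode("utf-8").count(b"/") == text.count("/")

# Property 2: no byte of a non-ASCII character can be mistaken for ASCII.
assert non_ascii_bytes_have_high_bit(mixed_text)
assert slash_count_matches(mixed_text)
```

Because no byte of a multibyte sequence falls in the ASCII range, a byte-oriented tool searching for an ASCII character such as '/' cannot get a false match inside a non-ASCII character.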

UTF-8 was defined in RFC 2279, which has since been obsoleted by RFC 3629.

["File System Safe UCS Transformation Format (FSS_UTF)", X/Open Preliminary Specification, X/Open Company Ltd., Document Number: P316. This information also appears in ISO/IEC 10646, Annex P].

Plan 9 UTF manual entry.

UTF-8

(Unicode Transformation Format-8) A format in the Unicode coding system that uses from one to four bytes per character. When coding the English language, only one byte is used per character, as in regular ASCII encoding. See Unicode and ASCII.
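The one-to-four-byte range can be illustrated with a small Python sketch. The four sample characters and the helper name `utf8_length` are chosen for illustration only:

```python
def utf8_length(ch):
    """Return the number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

samples = [
    "A",           # U+0041, Latin capital letter A (ASCII)
    "\u00e9",      # U+00E9, Latin small letter e with acute
    "\u20ac",      # U+20AC, euro sign
    "\U0001f600",  # U+1F600, an emoji outside the 16-bit range
]

for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
# Prints:
# U+0041 -> 1 byte(s)
# U+00E9 -> 2 byte(s)
# U+20AC -> 3 byte(s)
# U+1F600 -> 4 byte(s)
```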

UTF-8


Acronym    Definition
UTF-8      Universal Transformation Format-8 (character encoding)

Copyright © 2004-2022 Newdu.com All Rights Reserved
Last updated: 2025/3/21 1:11:09