http://stackoverflow.com/a/7048780
Wikipedia explains both reasonably well: UTF-8 vs Latin-1 (ISO-8859-1). The former is a variable-length encoding; the latter is a single-byte, fixed-length encoding.
Latin-1 encodes just the first 256 code points of the Unicode character set, whereas UTF-8 can encode all code points.
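To illustrate the difference in coverage, here is a minimal Python sketch. The helper name `can_encode` is made up for this example; it relies only on Python's built-in `latin-1` and `utf-8` codecs and on `str.encode` raising `UnicodeEncodeError` for characters a codec cannot represent.

```python
def can_encode(ch, encoding):
    """Return True if the character can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

euro = "\u20ac"      # EURO SIGN, code point 8364, outside the Latin-1 range
e_acute = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, code point 233

print(can_encode(e_acute, "latin-1"))  # True: code point is below 256
print(can_encode(euro, "latin-1"))     # False: code point is above 255
print(can_encode(euro, "utf-8"))       # True: UTF-8 covers all code points
```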
At the physical encoding level, only code points 0 - 127 are encoded identically; code points 128 - 255 differ:
- UTF-8 encodes each as a 2-byte sequence
- Latin-1 encodes each as a single byte.
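The byte-level difference can be checked directly. The following is a small Python sketch, not part of the original answer; the function name `encoded_bytes` is chosen here for illustration.

```python
def encoded_bytes(ch):
    """Return the (UTF-8, Latin-1) encodings of a character as hex strings."""
    return ch.encode("utf-8").hex(), ch.encode("latin-1").hex()

# Code point 65 ("A") is in the 0 - 127 range: both encodings agree.
print(encoded_bytes("A"))        # ('41', '41')

# Code point 233 ("é") is in the 128 - 255 range: the encodings differ.
print(encoded_bytes("\u00e9"))   # ('c3a9', 'e9')
```

One practical consequence is that decoding UTF-8 bytes as Latin-1 turns every such character into two characters, for example "é" becomes "Ã©".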