Sybase 12.4.2 Server User Manual


 
CHAPTER 9 International Languages and Character Sets
For multibyte character sets, the Encodings section lists which byte values
are lead bytes and which byte values are valid follow bytes.
For example, the Shift-JIS Encodings section is as follows:
Encodings:
[\x00-\x80,\xa0-\xdf,\xf0-\xff]
[\x81-\x9f,\xe0-\xef][\x40-\x7e,\x80-\xfc]
The first line following the section title lists valid single-byte characters. The
square brackets enclose a comma-separated list of ranges. Each range is listed
as a hyphen-separated pair of values. In the Shift-JIS collation, values \x00 to
\x80 are valid single-byte characters, but \x81 is not a valid single-byte
character.
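The second line following the section title lists valid multibyte characters.
The first pair of square brackets on that line lists the valid lead bytes, and
the second pair lists the valid follow bytes. Any combination of one lead byte
followed by one follow byte is a valid character. Therefore \x81\x40 is a valid
double-byte character, but \x81\x00 is not, because \x00 is not a valid
follow byte.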
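The rules above can be illustrated with a short Python sketch. It is not how
the server reads a collation file; it only assumes the bracket and range
notation described in this section. The names parse_ranges,
parse_encoding_line, and is_valid are hypothetical helpers introduced for
illustration.

```python
import re

def parse_ranges(spec):
    """Turn the text inside one pair of brackets into a set of byte values."""
    values = set()
    for part in spec.split(","):
        # Each part is either a single value or a hyphen-separated pair.
        bounds = [int(h, 16) for h in re.findall(r"\\x([0-9a-fA-F]{2})", part)]
        lo, hi = bounds[0], bounds[-1]  # a single value is a range of one
        values.update(range(lo, hi + 1))
    return values

def parse_encoding_line(line):
    """Return one set of byte values per bracketed group on the line."""
    return [parse_ranges(group) for group in re.findall(r"\[([^\]]*)\]", line)]

# The two lines of the Shift-JIS Encodings section shown above.
SINGLE = parse_encoding_line(r"[\x00-\x80,\xa0-\xdf,\xf0-\xff]")
DOUBLE = parse_encoding_line(r"[\x81-\x9f,\xe0-\xef][\x40-\x7e,\x80-\xfc]")

def is_valid(data):
    """Check that data consists only of valid single- or double-byte characters."""
    i = 0
    while i < len(data):
        if data[i] in SINGLE[0]:
            i += 1  # valid single-byte character
        elif (data[i] in DOUBLE[0] and i + 1 < len(data)
              and data[i + 1] in DOUBLE[1]):
            i += 2  # valid lead byte followed by a valid follow byte
        else:
            return False
    return True
```

With these definitions, is_valid(b"\x81\x40") accepts the double-byte
character from the example, while is_valid(b"\x81\x00") and is_valid(b"\x81")
reject a lead byte that has an invalid follow byte or no follow byte.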
The Properties section
The Properties section is optional, and follows the Encodings section.
If a Properties section is supplied, an Encodings section must be supplied also.
The Properties section lists the first-byte values of the characters that are
to be treated as alphabetic characters, digits, or spaces.
The Shift-JIS Properties section is as follows:
Properties:
space: [\x09-\x0d,\x20]
digit: [\x30-\x39]
alpha: [\x41-\x5a,\x61-\x7a,\x81-\x9f,\xe0-\xef]
This indicates that characters with first bytes \x09 to \x0d, as well as \x20,
are to be treated as space characters. Characters with first bytes in the range
\x30 to \x39 inclusive are digits. Characters with first bytes in the four
ranges \x41-\x5a, \x61-\x7a, \x81-\x9f, and \xe0-\xef are alphabetic
characters.
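As a minimal sketch of how these properties could be applied, the following
Python fragment classifies a character by its first byte only, using the
Shift-JIS ranges listed above. The PROPERTIES table and the classify function
are hypothetical names used for illustration, not part of the product.

```python
# First-byte ranges copied from the Shift-JIS Properties section above.
PROPERTIES = {
    "space": [(0x09, 0x0D), (0x20, 0x20)],
    "digit": [(0x30, 0x39)],
    "alpha": [(0x41, 0x5A), (0x61, 0x7A), (0x81, 0x9F), (0xE0, 0xEF)],
}

def classify(char_bytes):
    """Return 'space', 'digit', or 'alpha' based on the first byte, else None."""
    first = char_bytes[0]
    for name, ranges in PROPERTIES.items():
        if any(lo <= first <= hi for lo, hi in ranges):
            return name
    return None
```

Because only the first byte is examined, every double-byte character whose
lead byte falls in \x81-\x9f or \xe0-\xef is classified as alphabetic in this
sketch, regardless of its follow byte.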