Collation internals
: ' '
: _
: \xF2
: \xEE
: \xF0
: -
: ','
: ;
: ':'
: !
% Sort some letters in alphabetical order
: A a A
: a a A
: B b B
: b b B
% Sort some E's from code page 850,
% including some accented extended characters:
: e e E, \x82 \x82 \x90, \x8A \x8A \xD4
: E e E, \x90 \x82 \x90, \xD4 \x8A \xD4
Other syntax notes
For databases using case-insensitive sorting and comparison (that is, CASE
IGNORE was specified when the database was created), the lowercase and
uppercase mappings determine which lowercase and uppercase characters are
sorted together.
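As an illustration only, the following Python sketch shows one plausible reading of this rule. It assumes each collation line has the form ": character lowercase uppercase", as the sample entries suggest. The names parse_case_maps and sort_key are hypothetical and are not part of any database tool; quoted characters, \x escapes, and comma-separated groups are deliberately not handled.

```python
# Hypothetical sketch: the real collation tool's parser is not described here.
SAMPLE = """\
% Sort some letters in alphabetical order
: A a A
: a a A
: B b B
: b b B
"""

def parse_case_maps(text):
    """Build (position, lower_of) from ': char lower upper' lines."""
    position = {}   # character -> index of its line in the file
    lower_of = {}   # character -> its lowercase mapping
    for index, line in enumerate(text.splitlines()):
        if not line.startswith(":"):
            continue  # skip % comments and anything else
        char, lower, _upper = line[1:].split()
        position[char] = index
        lower_of[char] = lower
    return position, lower_of

def sort_key(word, position, lower_of, case_ignore):
    """Sort key for a word; folds through the lowercase mapping if case_ignore."""
    if case_ignore:
        # Assumption: characters sharing a lowercase mapping sort together.
        word = "".join(lower_of[c] for c in word)
    return [position[c] for c in word]

position, lower_of = parse_case_maps(SAMPLE)
words = ["b", "B", "a", "A"]
case_sensitive = sorted(words, key=lambda w: sort_key(w, position, lower_of, False))
case_ignored = sorted(words, key=lambda w: sort_key(w, position, lower_of, True))
```

With case_ignore set, "A" and "a" produce the same key, so they compare as equal and keep their input order relative to each other.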
For multibyte character sets, only the first byte of a character is listed in
the collation sequence. All characters with the same first byte are sorted
together, and are ordered among themselves by the values of their following
bytes. For example, the following is part of the Shift-JIS collation file:
: \xfb
: \xfc
: \xfd
In this collation, all characters with first byte \xfc come after all characters with
first byte \xfb and before all characters with first byte \xfd. The two-byte
character \xfc \x01 would be ordered before the two-byte character \xfc \x02.
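The ordering described above can be sketched as a two-part sort key: the position of the first byte in the collation sequence, then the raw values of the remaining bytes. This is a minimal Python illustration, not the database's implementation. FIRST_BYTE_ORDER holds only the three first bytes from the excerpt, and char_key is a hypothetical name.

```python
# First bytes in the order they appear in the collation excerpt.
FIRST_BYTE_ORDER = [0xFB, 0xFC, 0xFD]
RANK = {byte: index for index, byte in enumerate(FIRST_BYTE_ORDER)}

def char_key(ch: bytes):
    """Key for one multibyte character: (rank of first byte, following bytes)."""
    # bytes objects compare lexicographically by unsigned byte value.
    return (RANK[ch[0]], ch[1:])

chars = [b"\xfd\x01", b"\xfc\x02", b"\xfb\x7f", b"\xfc\x01"]
ordered = sorted(chars, key=char_key)
```

After sorting, the \xfb character comes first, \xfc \x01 precedes \xfc \x02, and the \xfd character comes last.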
Any characters omitted from the collation are added to the end of the collation.
The tool that processes the collation file issues a warning when it does so.
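A rough Python sketch of this behavior follows. The text does not say in what order omitted characters are appended, so appending them in the order of the full character set is an assumption here. The function name complete_collation and the warning text are hypothetical as well.

```python
import warnings

def complete_collation(listed, alphabet):
    """Return the listed characters followed by any omitted from the listing."""
    seen = set(listed)
    # Assumption: omitted characters are appended in alphabet order.
    missing = [c for c in alphabet if c not in seen]
    if missing:
        warnings.warn(
            f"{len(missing)} character(s) omitted from the collation; "
            "added to the end"
        )
    return list(listed) + missing

# Capture the warning so the example can inspect it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    full = complete_collation(["b", "a"], "abcd")
```

Here "c" and "d" are missing from the listing, so they follow "b" and "a" in the result and a single warning is recorded.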
The Encodings section
The Encodings section is optional, and follows the collation sequence. It is not
useful for single-byte character sets.