Protocols — Compression Algorithm Specification
Version 1.10 12/01/02 17-5
The Block Header is composed of the tightly packed (no padding bits) fields described in
Table 17-1.
Table 17-1. Block Header Fields
Field Name Length (bits) Description
Block Size 16 The size of this Block. Block Size is defined as the number of original
characters plus the number of pointers that appear in the Block Body:
Block Size = Number of Original Characters in the Block Body +
Number of Pointers in the Block Body.
Extra Set Code Length Array Size 5 The number of code lengths in the Extra Set Code Length Array. The
Extra Set Code Length Array contains code lengths of the Extra Set in
increasing order of the symbols, and if all symbols greater than a
certain symbol have zero code length, the Extra Set Code Length
Array terminates at the last nonzero code length symbol. Since there
are 19 symbols in the Extra Set (see the description of the Char&Len
Set Code Length Array), the maximum Extra Set Code Length Array
Size is 19.
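The trimming rule above can be sketched as follows. This is a minimal illustration, not the specification's reference code: the helper names `trimmed_size` and `pack_leading_fields` are hypothetical, the bits are modelled as a text string of "0" and "1" characters, and writing each field most significant bit first is an assumption that this table does not state.

```python
EXTRA_SET_SYMBOLS = 19  # symbol count given in the table above


def trimmed_size(code_lengths):
    """Return the entry count up to and including the last nonzero code length."""
    size = len(code_lengths)
    while size > 0 and code_lengths[size - 1] == 0:
        size -= 1
    return size


def pack_leading_fields(block_size, extra_lengths):
    """Return Block Size (16 bits) followed by the Array Size (5 bits).

    Assumption: each field is written most significant bit first.
    The fields are concatenated directly, with no padding bits.
    """
    if not 0 <= block_size < (1 << 16):
        raise ValueError("Block Size must fit in 16 bits")
    if len(extra_lengths) != EXTRA_SET_SYMBOLS:
        raise ValueError("expected one code length per Extra Set symbol")
    size = trimmed_size(extra_lengths)
    return format(block_size, "016b") + format(size, "05b")
```

Because the array stops at the last nonzero code length, a full set of 19 lengths whose final nonzero entry is at index 4 yields an Array Size of 5, and the largest possible value, 19, still fits in the 5-bit field.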
Extra Set Code Length Array Variable If Extra Set Code Length Array Size is 0, then this field is a 5-bit value
that represents the only Huffman code used.
If Extra Set Code Length Array Size is not 0, then this field is an
encoded form of a concatenation of code lengths in increasing order of
the symbols.
The concatenation of code lengths is encoded as follows:
If a code length is less than 7, then it is encoded as a 3-bit value.
If a code length is equal to or greater than 7, then it is encoded as a
series of “1”s followed by a terminating “0”. The number of “1”s =
Code length – 4. For example, code length 10 is encoded as
“1111110”; code length 7 is encoded as “1110”.
After the third length of the code length concatenation, a 2-bit value is
used to indicate the number of consecutive zero lengths immediately
after the third length. (Note that this 2-bit value appears only once, after the
third length, and does NOT appear again after every third
length.) This 2-bit value ranges from 0 to 3. For example, if the 2-bit
value is “00”, there are no zero lengths at this point, and the
following encoding starts from the fourth code length; if the 2-bit value
is “10”, the fourth and fifth lengths are zero and the following
encoding starts from the sixth code length.
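The encoding rules for the Extra Set Code Length Array can be sketched as a round trip. This is an illustration under stated assumptions, not the specification's reference code: the function names are hypothetical, bits are modelled as a text string of "0" and "1" characters, the 3-bit and 2-bit values are assumed to be written most significant bit first, and an array with fewer than three lengths is assumed to carry no 2-bit field. The Array Size = 0 case (a single 5-bit value) is not handled here.

```python
def encode_length(n):
    """Encode one code length as a string of '0'/'1' characters."""
    if n < 0:
        raise ValueError("code length must be non-negative")
    if n < 7:
        return format(n, "03b")      # plain 3-bit value
    return "1" * (n - 4) + "0"       # (n - 4) ones, then a terminating zero


def encode_lengths(lengths):
    """Encode an already trimmed code length array."""
    out = []
    i = 0
    while i < len(lengths):
        out.append(encode_length(lengths[i]))
        i += 1
        if i == 3:
            # Emitted once only: count of zero lengths right after the third.
            run = 0
            while run < 3 and i + run < len(lengths) and lengths[i + run] == 0:
                run += 1
            out.append(format(run, "02b"))
            i += run                 # the skipped zeros are not written
    return "".join(out)


def decode_lengths(bits, size):
    """Decode `size` code lengths; return (lengths, bits consumed)."""
    lengths = []
    pos = 0
    while len(lengths) < size:
        n = int(bits[pos:pos + 3], 2)
        pos += 3
        if n == 7:
            # "111" starts the unary form: each further "1" adds one.
            while bits[pos] == "1":
                n += 1
                pos += 1
            pos += 1                 # consume the terminating "0"
        lengths.append(n)
        if len(lengths) == 3:
            run = int(bits[pos:pos + 2], 2)
            pos += 2
            lengths.extend([0] * run)
    return lengths, pos
```

For the lengths 2, 3, 1, 0, 0, 7, 10, 4, the encoder writes three 3-bit values, then “10” for the two zero lengths, then “1110”, “1111110”, and “100”. A run of more than three zeros after the third length is capped at 3 by the 2-bit field, and the remaining zeros are written as ordinary 3-bit values.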
Position Set Code Length Array Size 4 The number of code lengths in the Position Set Code Length Array.
The Position Set Code Length Array contains code lengths of the Position
Set in increasing order of the symbols in the Position Set, and if all
symbols greater than a certain symbol have zero code length, the
Position Set Code Length Array terminates at the last nonzero code
length symbol. Since there are 14 symbols in the Position Set (see
3.3.2), the maximum Position Set Code Length Array Size is 14.