Protocols — Compression Algorithm Specification
Version 1.10 12/01/02 17-11
17.3.2.2 Searching the Tree
Traversing the search tree is performed as follows:
The following example uses the search tree shown in Figure 17-5 above. Assume that the current
position in the source data contains the string “camxrsxpj….”
1. The starting character “c” is used to find the root of the tree. The next character “a” is used to
follow the edge from node 1 to node 2. The “position” of node 2 is 500, so a string starting
with “ca” can be found at position 500. The string at the current position is compared with the
string starting at position 500.
2. Node 2 is at Level 3; so at most three characters are compared. Assume that the three-character
comparison passes.
3. The fourth character “x” is used to follow the edge from Node 2 to Node 5. The position value
of node 5 is 400, which means there is a string located in position 400 that starts with “cam”
and the character at position 403 is an “x.”
4. Node 5 is at Level 8, so the fifth to eighth characters of the source data are compared with the
string starting at position 404. Assume the strings match.
5. At this point, the ninth character “p” has been reached. It is used to follow the edge from
Node 5 to Node 7.
6. This process continues until a mismatch occurs, or the length of the matching strings exceeds
the predefined MAX_MATCH_LENGTH. The most recent matching string (which is also the
longest) is the desired matching string.
17.3.2.3 Adding String Info
String info needs to be added to the String Info Log for each position in the source data. Each time
a search for a matching string is performed, the new string info is inserted for the current position.
There are several cases that can be discussed:
1. No root is found for the first character. A new tree is created with the root node labeled with
the starting character and a child leaf node with its edge to the root node labeled with the
second character in the string. The “position” value of the child node is set to the current
position.
2. One root node matches the first character, but the second character does not match any edge
extending from the root node. A new child leaf node is created with its edge labeled with the
second character. The “position” value of the new leaf child node is set to the current position.
3. A string comparison succeeds with an internal node, but a matching edge for the next character
does not exist. This is similar to (2) above. A new child leaf node is created with its edge
labeled with the character that does not exist. The “position” value of the new leaf child node
is set to the current position.
4. A string comparison exceeds MAX_MATCH_LENGTH. Note: This only happens with leaf
nodes. For this case, the “position” value in the leaf node is updated with the current position.