Support User Manuals

Intel Extensible Firmware Interface Network Router User Manual

Open as PDF

of 1084

Version 1.10 12/01/02 17-1

17

Protocols - Compression Algorithm

Specification

In EFI firmware storage, binary codes/data are often compressed to save storage space. These

compressed codes/data are extracted into memory for execution at boot time. This demands an

efficient lossless compression/decompression algorithm. The compressor must produce small

compressed images, and the decompressor must operate fast enough to avoid delays at boot time.

This chapter describes in detail the EFI compression/decompression algorithm, as well as the EFI

Decompress Protocol. The EFI Decompress Protocol provides a standard decompression interface

for use at boot time.

17.1 Algorithm Overview

In this chapter the term “character” denotes a single byte and the term “string” denotes a series of

concatenated characters.

The compression/decompression algorithm used in EFI firmware storage is a combination of the

LZ77 algorithm and Huffman Coding. The LZ77 algorithm replaces a repeated string with a

pointer to the previous occurrence of the string. Huffman Coding encodes symbols in a way that

the more frequently a symbol appears in a text, the shorter the code that is assigned to it.

The compression process contains two steps:

• The first step is to find repeated strings (using LZ77 algorithm) and produce intermediate data.

Beginning with the first character, the compressor scans the source data and determines if the

characters starting at the current position can form a string previously appearing in the text. If

a long enough matching string is found, the compressor will output a pointer to the string. If

the pointer occupies more space than the string itself, the compressor will output the original

character at the current position in the source data. Then the compressor advances to the next

position and repeats the process. To speed up the compression process, the compressor

dynamically maintains a String Info Log to record the positions and lengths of strings

encountered, so that string comparisons are performed quickly by looking up the String

Info Log.

Because a compressor cannot have unlimited resources, as the compression continues the

compressor removes “old” string information. This prevents the String Info Log from

becoming too large. As a result, the algorithm can only look up repeated strings within the

range of a fixed-sized “sliding window” behind the current position.

In this way, a stream of intermediate data is produced which contains two types of symbols:

the Original Characters (to be preserved in the decompressed data), and the Pointers

(representing a previous string). A Pointer consists of two elements: the String Position and

the String Length, representing the location and the length of the target string, respectively.

previous next