<HTML ><HEAD ><TITLE >linux/arch/i386/boot/compressed/head.S</TITLE ><META NAME="GENERATOR" CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK REL="HOME" TITLE="Linux i386 Boot Code HOWTO" HREF="index.html"><LINK REL="PREVIOUS" TITLE="linux/arch/i386/boot/setup.S" HREF="setup.html"><LINK REL="NEXT" TITLE="linux/arch/i386/kernel/head.S" HREF="kernel_head.html"></HEAD ><BODY CLASS="sect1" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#840084" ALINK="#0000FF" ><DIV CLASS="NAVHEADER" ><TABLE SUMMARY="Header navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TH COLSPAN="3" ALIGN="center" >Linux i386 Boot Code HOWTO</TH ></TR ><TR ><TD WIDTH="10%" ALIGN="left" VALIGN="bottom" ><A HREF="setup.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="80%" ALIGN="center" VALIGN="bottom" ></TD ><TD WIDTH="10%" ALIGN="right" VALIGN="bottom" ><A HREF="kernel_head.html" ACCESSKEY="N" >Next</A ></TD ></TR ></TABLE ><HR ALIGN="LEFT" WIDTH="100%"></DIV ><DIV CLASS="sect1" ><H1 CLASS="sect1" ><A NAME="compressed_head" ></A >5. linux/arch/i386/boot/compressed/head.S</H1 ><P > We are in <EM >bvmlinux</EM > now! With the help of <EM >misc.c:decompress_kernel()</EM >, we are going to decompress <EM >piggy.o</EM > to get the resident kernel image <TT CLASS="filename" >linux/vmlinux</TT >. </P ><P > This file is of pure 32-bit startup code. Unlike previous two files, it has no ".code16" statement in the source file. Refer to <A HREF="http://www.gnu.org/software/binutils/manual/gas-2.9.1/html_chapter/as_16.html#SEC205" TARGET="_top" > Using as: Writing 16-bit Code</A > for details. </P ><DIV CLASS="sect2" ><H2 CLASS="sect2" ><A NAME="decompress_kernel" ></A >5.1. Decompress Kernel</H2 ><P > The segment base addresses in segment descriptors (which correspond to segment selector __KERNEL_CS and __KERNEL_DS) are equal to 0; therefore, the logical address offset (in segment:offset format) will be equal to its linear address if either of these segment selectors is used. For <EM >zImage</EM >, CS:EIP is at logical address 10:1000 (linear address 0x1000) now; for <EM >bzImage</EM >, 10:100000 (linear address 0x100000). </P ><P > As paging is not enabled, linear address is identical to physical address. Check IA-32 Manual (Vol.1. Ch.3.3. Memory Organization, and Vol.3. Ch.3. Protected-Mode Memory Management) and <A HREF="http://www.xml.com/ldd/chapter/book/ch13.html#t1" TARGET="_top" > Linux Device Drivers: Memory Management in Linux</A > for address issue. </P ><P > It comes from <TT CLASS="filename" >setup.S</TT > that BX=0 and ESI=INITSEG<<4. </P ><P > <TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="100%" ><TR ><TD ><FONT COLOR="#000000" ><PRE CLASS="programlisting" >.text /////////////////////////////////////////////////////////////////////////////// startup_32() { cld; cli; DS = ES = FS = GS = __KERNEL_DS; SS:ESP = *stack_start; // end of user_stack[], defined in misc.c // all segment registers are reloaded after protected mode is enabled // check that A20 really IS enabled EAX = 0; do { 1: DS:[0] = ++EAX; } while (DS:[0x100000]==EAX); EFLAGS = 0; clear BSS; // from _edata to _end struct moveparams mp; // subl $16,%esp if (!decompress_kernel(&mp, ESI)) { // return value in AX restore ESI from stack; EBX = 0; goto __KERNEL_CS:100000; // see linux/arch/i386/kernel/head.S:startup_32 } /* * We come here, if we were loaded high. * We need to move the move-in-place routine down to 0x1000 * and then start it with the buffer addresses in registers, * which we got from the stack. */ 3: move move_rountine_start..move_routine_end to 0x1000; // move_routine_start & move_routine_end are defined below // prepare move_routine_start() parameters EBX = real mode pointer; // ESI value passed from setup.S ESI = mp.low_buffer_start; ECX = mp.lcount; EDX = mp.high_buffer_star; EAX = mp.hcount; EDI = 0x100000; cli; // make sure we don't get interrupted goto __KERNEL_CS:1000; // move_routine_start(); } /* Routine (template) for moving the decompressed kernel in place, * if we were high loaded. This _must_ PIC-code ! */ /////////////////////////////////////////////////////////////////////////////// move_routine_start() { move mp.low_buffer_start to 0x100000, mp.lcount bytes, in two steps: (lcount >> 2) words + (lcount & 3) bytes; move/append mp.high_buffer_start, ((mp.hcount + 3) >> 2) words // 1 word == 4 bytes, as I mean 32-bit code/data. ESI = EBX; // real mode pointer, as that from setup.S EBX = 0; goto __KERNEL_CS:100000; // see linux/arch/i386/kernel/head.S:startup_32() move_routine_end: }</PRE ></FONT ></TD ></TR ></TABLE > For the meaning of "je 1b" and "jnz 3f", refer to <A HREF="http://www.gnu.org/software/binutils/manual/gas-2.9.1/html_chapter/as_5.html#SEC48" TARGET="_top" > Using as: Local Symbol Names</A >. </P ><P > Didn't find <EM >_edata</EM > and <EM >_end</EM > definitions? No problem, they are defined in the "internal linker script". Without -T (--script=) option specified, <B CLASS="command" >ld</B > uses this builtin script to link <EM >compressed/bvmlinux</EM >. Use "<B CLASS="command" >ld --verbose</B >" to display this script, or check Appendix B. <A HREF="internel_lds.html" ><I >Internal Linker Script</I ></A >. </P ><P > Refer to <A HREF="http://www.gnu.org/software/binutils/manual/ld-2.9.1/html_chapter/ld_2.html#SEC3" TARGET="_top" > Using LD, the GNU linker: Command Line Options</A > for -T (--script=), -L (--library-path=) and --verbose option description. "<B CLASS="command" >man ld</B >" and "<B CLASS="command" >info ld</B >" may help too. </P ><P > <EM >piggy.o</EM > has been unzipped and control is passed to __KERNEL_CS:100000, i.e. <EM >linux/arch/i386/kernel/head.S:startup_32()</EM >. See <A HREF="kernel_head.html" >Section 6</A >. </P ><P > <TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="100%" ><TR ><TD ><FONT COLOR="#000000" ><PRE CLASS="programlisting" >#define LOW_BUFFER_START 0x2000 #define LOW_BUFFER_MAX 0x90000 #define HEAP_SIZE 0x3000 /////////////////////////////////////////////////////////////////////////////// asmlinkage int decompress_kernel(struct moveparams *mv, void *rmode) |-- setup real_mode(=rmode), vidmem, vidport, lines and cols; |-- if (is_zImage) setup_normal_output_buffer() { | output_data = 0x100000; | free_mem_end_ptr = real_mode; | } else (is_bzImage) setup_output_buffer_if_we_run_high(mv) { | output_data = LOW_BUFFER_START; | low_buffer_end = MIN(real_mode, LOW_BUFFER_MAX) & ~0xfff; | low_buffer_size = low_buffer_end - LOW_BUFFER_START; | free_mem_end_ptr = &end + HEAP_SIZE; | // get mv->low_buffer_start and mv->high_buffer_start | mv->low_buffer_start = LOW_BUFFER_START; | /* To make this program work, we must have | * high_buffer_start > &end+HEAP_SIZE; | * As we will move low_buffer from LOW_BUFFER_START to 0x100000 | * (max low_buffer_size bytes) finally, we should have | * high_buffer_start > 0x100000+low_buffer_size; */ | mv->high_buffer_start = high_buffer_start | = MAX(&end+HEAP_SIZE, 0x100000+low_buffer_size); | mv->hcount = 0 if (0x100000+low_buffer_size > &end+HEAP_SIZE); | = -1 if (0x100000+low_buffer_size <= &end+HEAP_SIZE); | /* mv->hcount==0 : we need not move high_buffer later, | * as it is already at 0x100000+low_buffer_size. | * Used by close_output_buffer_if_we_run_high() below. */ | } |-- makecrc(); // create crc_32_tab[] | puts("Uncompressing Linux... "); |-- gunzip(); | puts("Ok, booting the kernel.\n"); |-- if (is_bzImage) close_output_buffer_if_we_run_high(mv) { | // get mv->lcount and mv->hcount | if (bytes_out > low_buffer_size) { | mv->lcount = low_buffer_size; | if (mv->hcount) | mv->hcount = bytes_out - low_buffer_size; | } else { | mv->lcount = bytes_out; | mv->hcount = 0; | } | } `-- return is_bzImage; // return value in AX</PRE ></FONT ></TD ></TR ></TABLE > <EM >end</EM > is defined in the "internal linker script" too. </P ><P > <EM >decompress_kernel()</EM > has an "asmlinkage" modifer. In <TT CLASS="filename" >linux/include/linux/linkage.h</TT >: <TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="100%" ><TR ><TD ><FONT COLOR="#000000" ><PRE CLASS="programlisting" >#ifdef __cplusplus #define CPP_ASMLINKAGE extern "C" #else #define CPP_ASMLINKAGE #endif #if defined __i386__ #define asmlinkage CPP_ASMLINKAGE __attribute__((regparm(0))) #elif defined __ia64__ #define asmlinkage CPP_ASMLINKAGE __attribute__((syscall_linkage)) #else #define asmlinkage CPP_ASMLINKAGE #endif</PRE ></FONT ></TD ></TR ></TABLE > Macro "asmlinkage" will force the compiler to pass all function arguments on the stack, in case some optimization method may try to change this convention. Check <A HREF="http://gcc.gnu.org/onlinedocs/gcc-3.3.2/gcc/Function-Attributes.html#Function%20Attributes" TARGET="_top" >Using the GNU Compiler Collection (GCC): Declaring Attributes of Functions</A > (regparm) and <A HREF="http://kernelnewbies.org/faq/index.php3#asmlinkage" TARGET="_top" >Kernelnewbies FAQ: What is asmlinkage</A > for more details. </P ></DIV ><DIV CLASS="sect2" ><H2 CLASS="sect2" ><A NAME="gunzip" ></A >5.2. gunzip()</H2 ><P > <EM >decompress_kernel()</EM > calls <EM >gunzip() -> inflate()</EM >, which are defined in <TT CLASS="filename" >linux/lib/inflate.c</TT >, to decompress resident kernel image to low buffer (pointed by <EM >output_data</EM >) and high buffer (pointed by <EM >high_buffer_start</EM >, for <EM >bzImage</EM > only). </P ><P > The gzip file format is specified in <A HREF="http://www.ietf.org/rfc/rfc1952.txt" TARGET="_top" >RFC 1952</A >. <DIV CLASS="table" ><A NAME="AEN857" ></A ><P ><B >Table 6. gzip file format</B ></P ><TABLE BORDER="1" CLASS="CALSTABLE" ><THEAD ><TR ><TH ALIGN="LEFT" VALIGN="MIDDLE" >Component</TH ><TH ALIGN="LEFT" VALIGN="MIDDLE" >Meaning</TH ><TH ALIGN="LEFT" VALIGN="MIDDLE" >Byte</TH ><TH ALIGN="LEFT" VALIGN="MIDDLE" >Comment</TH ></TR ></THEAD ><TBODY ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >ID1</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >IDentification 1</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >1</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >31 (0x1f, \037)</TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >ID2</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >IDentification 2</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >1</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >139 (0x8b, \213) <A NAME="AEN877" HREF="#FTN.AEN877" ><SPAN CLASS="footnote" >[a]</SPAN ></A > </TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >CM</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >Compression Method</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >1</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >8 - denotes the "deflate" compression method</TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >FLG</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >FLaGs</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >1</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >0 for most cases</TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >MTIME</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >Modification TIME</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >4</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >modification time of the original file</TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >XFL</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >eXtra FLags</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >1</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >2 - compressor used maximum compression, slowest algorithm <A NAME="AEN899" HREF="#FTN.AEN899" ><SPAN CLASS="footnote" >[b]</SPAN ></A > </TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >OS</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >Operating System</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >1</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >3 - Unix</TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >extra fields</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >-</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >-</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >variable length, field indicated by FLG <A NAME="AEN911" HREF="#FTN.AEN911" ><SPAN CLASS="footnote" >[c]</SPAN ></A > </TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >compressed blocks</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >-</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >-</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >variable length</TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >CRC32</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >-</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >4</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >CRC value of the uncompressed data</TD ></TR ><TR ><TD ALIGN="LEFT" VALIGN="MIDDLE" >ISIZE</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >Input SIZE</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >4</TD ><TD ALIGN="LEFT" VALIGN="MIDDLE" >the size of the uncompressed input data modulo 2^32</TD ></TR ></TBODY ><TR ><TD COLSPAN="4" >Notes:<BR><A NAME="FTN.AEN877" >a. </A > ID2 value can be 158 (0x9e, \236) for gzip 0.5; <BR><A NAME="FTN.AEN899" >b. </A > XFL value 4 - compressor used fastest algorithm; <BR><A NAME="FTN.AEN911" >c. </A > FLG bit 0, FTEXT, does not indicate any "extra field". <BR></TD ></TR ></TABLE ></DIV > </P ><P > We can use this file format knowledge to find out the beginning of gzipped <TT CLASS="filename" >linux/vmlinux</TT >. <TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="100%" ><TR ><TD ><FONT COLOR="#000000" ><PRE CLASS="screen" ><B CLASS="command" >[root@localhost boot]# hexdump -C /boot/vmlinuz-2.4.20-28.9 | grep '1f 8b 08 00'</B > 00004c50 1f 8b 08 00 01 f6 e1 3f 02 03 ec 5d 7d 74 14 55 |.......?...]}t.U| <B CLASS="command" >[root@localhost boot]# hexdump -C /boot/vmlinuz-2.4.20-28.9 -s 0x4c40 -n 64</B > 00004c40 00 80 0b 00 00 fc 21 00 68 00 00 00 1e 01 11 00 |......!.h.......| 00004c50 1f 8b 08 00 01 f6 e1 3f 02 03 ec 5d 7d 74 14 55 |.......?...]}t.U| 00004c60 96 7f d5 a9 d0 1d 4d ac 56 93 35 ac 01 3a 9c 6a |......M.V.5..:.j| 00004c70 4d 46 5c d3 7b f8 48 36 c9 6c 84 f0 25 88 20 9f |MF\.{.H6.l..%. .| 00004c80 <B CLASS="command" >[root@localhost boot]# hexdump -C /boot/vmlinuz-2.4.20-28.9 | tail -n 4</B > 00114d40 bd 77 66 da ce 6f 3d d6 33 5c 14 a2 9f 7e fa e9 |.wf..o=.3\...~..| 00114d50 a7 9f 7e fa ff 57 3f 00 00 00 00 00 d8 bc ab ea |..~..W?.........| 00114d60 44 5d 76 d1 fd 03 33 58 c2 f0 00 51 27 00 |D]v...3X...Q'.| 00114d6e</PRE ></FONT ></TD ></TR ></TABLE > We can see that the gzipped file begins at 0x4c50 in the above example. The four bytes before "1f 8b 08 00" is <EM >input_len</EM > (0x0011011e, in little endian), and 0x4c50+0x0011011e=0x114d6e equals to the size of <EM >bzImage</EM > (<TT CLASS="filename" >/boot/vmlinuz-2.4.20-28.9</TT >). </P ><P > <TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="100%" ><TR ><TD ><FONT COLOR="#000000" ><PRE CLASS="programlisting" >static uch *inbuf; /* input buffer */ static unsigned insize = 0; /* valid bytes in inbuf */ static unsigned inptr = 0; /* index of next byte to be processed in inbuf */ /////////////////////////////////////////////////////////////////////////////// static int gunzip(void) { Check input buffer for {ID1, ID2, CM}, must be {0x1f, 0x8b, 0x08} (normal case), or {0x1f, 0x9e, 0x08} (for gzip 0.5); Check FLG (flag byte), must not set bit 1, 5, 6 and 7; Ignore {MTIME, XFL, OS}; Handle optional structures, which correspond to FLG bit 2, 3 and 4; inflate(); // handle compressed blocks Validate {CRC32, ISIZE}; }</PRE ></FONT ></TD ></TR ></TABLE > When <EM >get_byte()</EM >, defined in <TT CLASS="filename" >linux/arch/i386/boot/compressed/misc.c</TT >, is called for the first time, it calls <EM >fill_inbuf()</EM > to setup input buffer <EM >inbuf=input_data</EM > and <EM >insize=input_len</EM >. Symbol <EM >input_data</EM > and <EM >input_len</EM > are defined in <EM >piggy.o</EM > linker script. See <A HREF="makefiles.html#i386_boot_compressed_makefile" >Section 2.5</A >. </P ></DIV ><DIV CLASS="sect2" ><H2 CLASS="sect2" ><A NAME="inflate" ></A >5.3. inflate()</H2 ><P > <TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="100%" ><TR ><TD ><FONT COLOR="#000000" ><PRE CLASS="programlisting" >// some important definitions in misc.c #define WSIZE 0x8000 /* Window size must be at least 32k, * and a power of two */ static uch window[WSIZE]; /* Sliding window buffer */ static unsigned outcnt = 0; /* bytes in output buffer */ // linux/lib/inflate.c #define wp outcnt #define flush_output(w) (wp=(w),flush_window()) STATIC unsigned long bb; /* bit buffer */ STATIC unsigned bk; /* bits in bit buffer */ STATIC unsigned hufts; /* track memory usage */ static long free_mem_ptr = (long)&end; /////////////////////////////////////////////////////////////////////////////// STATIC int inflate() { int e; /* last block flag */ int r; /* result code */ unsigned h; /* maximum struct huft's malloc'ed */ void *ptr; wp = bb = bk = 0; // inflate compressed blocks one by one do { hufts = 0; gzip_mark() { ptr = free_mem_ptr; }; if ((r = inflate_block(&e)) != 0) { gzip_release() { free_mem_ptr = ptr; }; return r; } gzip_release() { free_mem_ptr = ptr; }; if (hufts > h) h = hufts; } while (!e); /* Undo too much lookahead. The next read will be byte aligned so we * can discard unused bits in the last meaningful byte. */ while (bk >= 8) { bk -= 8; inptr--; } /* write the output window window[0..outcnt-1] to output_data, * update output_ptr/output_data, crc and bytes_out accordingly, and * reset outcnt to 0. */ flush_output(wp); /* return success */ return 0; }</PRE ></FONT ></TD ></TR ></TABLE > <EM >free_mem_ptr</EM > is used in <EM >misc.c:malloc()</EM > for dynamic memory allocation. Before inflating each compressed block, <EM >gzip_mark()</EM > saves the value of <EM >free_mem_ptr</EM >; After inflation, <EM >gzip_release()</EM > will restore this value. This is how it "<EM >free()</EM >" the memory allocated in <EM >inflate_block()</EM >. </P ><P > <A HREF="http://www.gzip.org" TARGET="_top" >Gzip</A > uses Lempel-Ziv coding (LZ77) to compress files. The compressed data format is specified in <A HREF="http://www.ietf.org/rfc/rfc1951.txt" TARGET="_top" >RFC 1951</A >. <EM >inflate_block()</EM > will inflate compressed blocks, which can be treated as a bit sequence. </P ><P > The data structure of each compressed block is outlined below: <TABLE BORDER="0" BGCOLOR="#E0E0E0" WIDTH="100%" ><TR ><TD ><FONT COLOR="#000000" ><PRE CLASS="programlisting" >BFINAL (1 bit) 0 - not the last block 1 - the last block BTYPE (2 bits) 00 - no compression remaining bits until the byte boundary; LEN (2 bytes); NLEN (2 bytes, the one's complement of LEN); data (LEN bytes); 01 - compressed with fixed Huffman codes { literal (7-9 bits, represent code 0..287, excluding 256); // See RFC 1951, table in Paragraph 3.2.6. length (0-5 bits if literal > 256, represent length 3..258); // See RFC 1951, 1st alphabet table in Paragraph 3.2.5. data (of literal bytes if literal < 256); distance (5 plus 0-13 extra bits if literal == 257..285, represent distance 1..32768); /* See RFC 1951, 2nd alphabet table in Paragraph 3.2.5, * but statement in Paragraph 3.2.6. */ /* Move backward "distance" bytes in the output stream, * and copy "length" bytes */ }* // can be of multiple instances literal (7 bits, all 0, literal == 256, means end of block); 10 - compressed with dynamic Huffman codes HLIT (5 bits, # of Literal/Length codes - 257, 257-286); HDIST (5 bits, # of Distance codes - 1, 1-32); HCLEN (4 bits, # of Code Length codes - 4, 4 - 19); Code Length sequence ((HCLEN+4)*3 bits) /* The following two alphabet tables will be decoded using * the Huffman decoding table which is generated from * the preceeding Code Length sequence. */ Literal/Length alphabet (HLIT+257 codes) Distance alphabet (HDIST+1 codes) // Decoding tables will be built from these alphpabet tables. /* The following is similar to that of fixed Huffman codes portion, * except that they use different decoding tables. */ { literal/length (variable length, depending on Literal/Length alphabet); data (of literal bytes if literal < 256); distance (variable length if literal == 257..285, depending on Distance alphabet); }* // can be of multiple instances literal (literal value 256, which means end of block); 11 - reserved (error)</PRE ></FONT ></TD ></TR ></TABLE > Note that data elements are packed into bytes starting from Least-Significant Bit (LSB) to Most-Significant Bit (MSB), while Huffman codes are packed starting with MSB. Also note that <EM >literal</EM > value 286-287 and <EM >distance</EM > codes 30-31 will never actually occur. </P ><P > With the above data structure in mind and RFC 1951 by hand, it is not too hard to understand <EM >inflate_block()</EM >. Refer to related paragraphs in RFC 1951 for Huffman coding and alphabet table generation. </P ><P > For more details, refer to <TT CLASS="filename" >linux/lib/inflate.c</TT >, gzip source code (many in-line comments) and related reference materials. </P ></DIV ><DIV CLASS="sect2" ><H2 CLASS="sect2" ><A NAME="chead_ref" ></A >5.4. Reference</H2 ><P > <P ></P ><UL ><LI ><P > <A HREF="http://www.gnu.org/software/binutils/manual/gas-2.9.1/" TARGET="_top" > Using as</A > </P ></LI ><LI ><P > <A HREF="http://www.gnu.org/software/binutils/manual/ld-2.9.1/" TARGET="_top" > Using LD, the GNU linker</A > </P ></LI ><LI ><P ><A HREF="http://developer.intel.com/design/pentium4/manuals/" TARGET="_top" > IA-32 Intel Architecture Software Developer's Manual</A ></P ></LI ><LI ><P ><A HREF="http://www.gzip.org" TARGET="_top" > The gzip home page</A ></P ></LI ><LI ><P ><A HREF="http://freshmeat.net/projects/gzip" TARGET="_top" > gzip (freshmeat.net)</A ></P ></LI ><LI ><P > <A HREF="http://www.ietf.org/rfc/rfc1951.txt" TARGET="_top" > RFC 1951: DEFLATE Compressed Data Format Specification version 1.3 </A > </P ></LI ><LI ><P > <A HREF="http://www.ietf.org/rfc/rfc1952.txt" TARGET="_top" > RFC 1952: GZIP file format specification version 4.3 </A > </P ></LI ></UL > </P ></DIV ></DIV ><DIV CLASS="NAVFOOTER" ><HR ALIGN="LEFT" WIDTH="100%"><TABLE SUMMARY="Footer navigation table" WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0" ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" ><A HREF="setup.html" ACCESSKEY="P" >Prev</A ></TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" ><A HREF="index.html" ACCESSKEY="H" >Home</A ></TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" ><A HREF="kernel_head.html" ACCESSKEY="N" >Next</A ></TD ></TR ><TR ><TD WIDTH="33%" ALIGN="left" VALIGN="top" >linux/arch/i386/boot/setup.S</TD ><TD WIDTH="34%" ALIGN="center" VALIGN="top" > </TD ><TD WIDTH="33%" ALIGN="right" VALIGN="top" >linux/arch/i386/kernel/head.S</TD ></TR ></TABLE ></DIV ></BODY ></HTML >