LZH format
Byte Order: Little-endian
There are three types of LZH header: level-0, level-1, and level-2.
level-0                 level-1, level-2
+------------+          +------------+
| LZH header |          | LZH header |
+------------+          +------------+
| compressed |          | extension  |
| data       |          | header     |
+------------+          +------------+
| LZH header |          | extension  |
+------------+          | header     |
| compressed |          +------------+
| data       |          | ...        |
+------------+          +------------+
...                     | compressed |
                        | data       |
                        +------------+
                        | LZH header |
                        +------------+
                        | extension  |
                        | header     |
                        +------------+
                        | extension  |
                        | header     |
                        +------------+
                        | ...        |
                        +------------+
                        | compressed |
                        | data       |
                        +------------+
                        ...
level-0
Offset   Length      Contents
0        1 byte      Size of archived file header (h)
1        1 byte      Header checksum
2        5 bytes     Method ID
7        4 bytes     Compressed size (n)
11       4 bytes     Uncompressed size
15       4 bytes     Original file date/time (Generic time stamp)
19       1 byte      File attribute
20       1 byte      Level (0x00)
21       1 byte      Filename / path length in bytes (f)
22       (f) bytes   Filename / path
22+(f)   2 bytes     CRC-16 of original file
24+(f)   (n) bytes   Compressed data
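The level-0 layout above can be read straight out of a byte buffer. The following is a minimal sketch, not a reference implementation: the struct `lzh_level0_header`, the helpers `read_le16`/`read_le32`, and the function `lzh_parse_level0` are hypothetical names introduced here. It follows the table as written (compressed data at 24+(f)) and does not verify the header checksum.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the level-0 fields listed in the table. */
typedef struct {
    uint8_t  header_size;      /* (h), offset 0 */
    uint8_t  checksum;         /* offset 1, not verified here */
    char     method[6];        /* 5-byte method ID plus terminating NUL */
    uint32_t compressed_size;  /* (n) */
    uint32_t original_size;
    uint32_t timestamp;        /* generic time stamp, still packed */
    uint8_t  attribute;
    uint8_t  level;
    uint8_t  name_length;      /* (f) */
    char     name[256];        /* (f) bytes plus terminating NUL */
    uint16_t crc16;            /* CRC-16 of the original file */
    size_t   data_offset;      /* 24+(f): start of compressed data */
} lzh_level0_header;

/* All multi-byte fields are little-endian. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the buffer is too short or the level is not 0. */
int lzh_parse_level0(const uint8_t *buf, size_t len, lzh_level0_header *out)
{
    size_t f;

    if (len < 22 || buf[20] != 0x00)
        return -1;
    f = buf[21];
    if (len < 24 + f)
        return -1;

    out->header_size = buf[0];
    out->checksum = buf[1];
    memcpy(out->method, buf + 2, 5);
    out->method[5] = '\0';
    out->compressed_size = read_le32(buf + 7);
    out->original_size = read_le32(buf + 11);
    out->timestamp = read_le32(buf + 15);
    out->attribute = buf[19];
    out->level = buf[20];
    out->name_length = (uint8_t)f;
    memcpy(out->name, buf + 22, f);
    out->name[f] = '\0';
    out->crc16 = read_le16(buf + 22 + f);
    out->data_offset = 24 + f;
    return 0;
}
```

A caller would read at least 24+(f) bytes of the archive into `buf`, call `lzh_parse_level0`, and then skip `compressed_size` bytes from `data_offset` to reach the next header.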
level-1
Offset   Length        Contents
0        1 byte        Size of archived file header (h)
1        1 byte        Header checksum
2        5 bytes       Method ID
7        4 bytes       Compressed size (n)
11       4 bytes       Uncompressed size
15       4 bytes       Original file date/time (Generic time stamp)
19       1 byte        File attribute (fixed at 0x20)
20       1 byte        Level (0x01)
21       1 byte        Filename / path length in bytes (f)
22       (f) bytes     Filename / path
22+(f)   2 bytes       CRC-16 of original file
24+(f)   1 byte        OS ID
25+(f)   2 bytes       Next header size (x) (0 means no extension header)
[                      // Extension headers
         1 byte        Extension type
         (x)-3 bytes   Extension fields
         2 bytes       Next header size (x) (0 means no next extension header)
]*
         (n) bytes     Compressed data
level-2
Offset   Length        Contents
0        2 bytes       Total size of archived file header (h)
2        5 bytes       Method ID
7        4 bytes       Compressed size (n)
11       4 bytes       Uncompressed size
15       4 bytes       Original file time stamp (UNIX type, seconds since 1970)
19       1 byte        Reserved
20       1 byte        Level (0x02)
21       2 bytes       CRC-16 of original file
23       1 byte        OS ID
24       2 bytes       Next header size (x) (0 means no extension header)
[                      // Extension headers
         1 byte        Extension type
         (x)-3 bytes   Extension fields
         2 bytes       Next header size (x) (0 means no next extension header)
]*
         (n) bytes     Compressed data
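In both level-1 and level-2, the extension headers form a chain: each "next header size" (x) gives the length of the following extension header, which itself ends with the next size field, and a size of 0 ends the chain. One way to walk that chain is sketched below. The iterator type `lzh_ext_iter` and the functions `lzh_ext_begin`/`lzh_ext_next` are hypothetical names, and the buffer is assumed to start at the first "next header size" field (offset 25+(f) in level-1, offset 24 in level-2).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical iterator over the extension header chain. */
typedef struct {
    const uint8_t *buf;       /* starts at the first "next header size" field */
    size_t         len;       /* number of valid bytes in buf */
    size_t         pos;       /* offset of the next extension header */
    uint16_t       next_size; /* (x) for the header at pos; 0 = end of chain */
} lzh_ext_iter;

static uint16_t ext_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 0 on success, -1 if buf cannot hold the first size field. */
int lzh_ext_begin(lzh_ext_iter *it, const uint8_t *buf, size_t len)
{
    if (len < 2)
        return -1;
    it->buf = buf;
    it->len = len;
    it->pos = 2;
    it->next_size = ext_le16(buf);
    return 0;
}

/*
 * Returns 1 and fills type/data/data_len for the next extension header,
 * 0 at the end of the chain, or -1 if a size field is inconsistent.
 * data points at the (x)-3 bytes of extension fields.
 */
int lzh_ext_next(lzh_ext_iter *it, uint8_t *type,
                 const uint8_t **data, size_t *data_len)
{
    size_t x = it->next_size;

    if (x == 0)
        return 0;
    if (x < 3 || it->len - it->pos < x)
        return -1;

    *type = it->buf[it->pos];
    *data = it->buf + it->pos + 1;
    *data_len = x - 3;
    it->next_size = ext_le16(it->buf + it->pos + x - 2);
    it->pos += x;
    return 1;
}
```

Once `lzh_ext_next` returns 0, `it->pos` is the offset (relative to `buf`) of the first byte after the chain, which is where the compressed data begins in the layouts above.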
Extension header
Common header:
    1 byte    Extension type (0x00)
    2 bytes   CRC-16 of header
    [1 byte   Information] (Optional)
    2 bytes   Next header size
File name header:
    1 byte    Extension type (0x01)
    ? bytes   File name
    2 bytes   Next header size
Directory name header:
    1 byte    Extension type (0x02)
    ? bytes   Directory name
    2 bytes   Next header size
Comment header:
    1 byte    Extension type (0x3f)
    ? bytes   Comments
    2 bytes   Next header size
MS-DOS attribute header:
    if (OS ID == EXTEND_MSDOS ||
        OS ID == EXTEND_HUMAN ||
        OS ID == EXTEND_GENERIC)
    1 byte    Extension type (0x40)
    2 bytes   Attr
    2 bytes   Next header size
UNIX file permission:
    if (OS ID == EXTEND_UNIX)
    1 byte    Extension type (0x50)
    2 bytes   Permission value
    2 bytes   Next header size
UNIX file group/user ID:
    if (OS ID == EXTEND_UNIX)
    1 byte    Extension type (0x51)
    2 bytes   Group ID
    2 bytes   User ID
    2 bytes   Next header size
UNIX file group name:
    1 byte    Extension type (0x52)
    ? bytes   Group name
    2 bytes   Next header size
UNIX file user name:
    1 byte    Extension type (0x53)
    ? bytes   User name
    2 bytes   Next header size
UNIX file last modified time:
    if (OS ID == EXTEND_UNIX)
    1 byte    Extension type (0x54)
    4 bytes   Last modified time in UNIX time
    2 bytes   Next header size
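As an illustration of how the fixed-size UNIX extension headers (0x50, 0x51, 0x54) might be decoded, here is a small sketch. It assumes the caller has already separated the extension type byte from the extension fields and passes only the fields (without the trailing "next header size"). The names `lzh_unix_info` and `lzh_unix_ext_apply` are hypothetical, and the OS ID check shown in the listing is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical collection of the values carried by types 0x50, 0x51, 0x54. */
typedef struct {
    int      has_mode, has_ids, has_mtime;
    uint16_t mode;   /* 0x50: permission value */
    uint16_t gid;    /* 0x51: group ID (stored first) */
    uint16_t uid;    /* 0x51: user ID (stored second) */
    uint32_t mtime;  /* 0x54: last modified time in UNIX time */
} lzh_unix_info;

static uint16_t unix_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t unix_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Stores the fields of one extension header into info.
 * Returns 1 if the type was handled, 0 if it is not one of the three
 * types above, and -1 if the field length does not match the layout.
 */
int lzh_unix_ext_apply(lzh_unix_info *info, uint8_t type,
                       const uint8_t *data, size_t len)
{
    switch (type) {
    case 0x50:
        if (len != 2)
            return -1;
        info->mode = unix_le16(data);
        info->has_mode = 1;
        return 1;
    case 0x51:
        if (len != 4)
            return -1;
        info->gid = unix_le16(data);
        info->uid = unix_le16(data + 2);
        info->has_ids = 1;
        return 1;
    case 0x54:
        if (len != 4)
            return -1;
        info->mtime = unix_le32(data);
        info->has_mtime = 1;
        return 1;
    default:
        return 0;
    }
}
```

Variable-length headers such as the file name (0x01) or group name (0x52) carry no length of their own; their length is the enclosing header size minus 3.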
Method ID
"-lh0-"   No compression
"-lh1-"   4k sliding dictionary (max 60 bytes) + dynamic Huffman + fixed encoding of position
"-lh2-"   8k sliding dictionary (max 256 bytes) + dynamic Huffman (Obsoleted)
"-lh3-"   8k sliding dictionary (max 256 bytes) + static Huffman (Obsoleted)
"-lh4-"   4k sliding dictionary (max 256 bytes) + static Huffman + improved encoding of position and trees
"-lh5-"   8k sliding dictionary (max 256 bytes) + static Huffman + improved encoding of position and trees
"-lh6-"   32k sliding dictionary (max 256 bytes) + static Huffman + improved encoding of position and trees
"-lh7-"   64k sliding dictionary (max 256 bytes) + static Huffman + improved encoding of position and trees
"-lzs-"   2k sliding dictionary (max 17 bytes)
"-lz4-"   No compression
"-lz5-"   4k sliding dictionary (max 17 bytes)
"-lhd-"   Directory (no compressed data)
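A decoder typically needs the dictionary size and maximum match length for the method named in the header. The lookup below is a sketch that simply transcribes the list above into a table, taking "4k" to mean 4096 bytes and so on. The type `lzh_method_info` and the function `lzh_find_method` are hypothetical names; 0 is used where a value does not apply.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one method ID from the list above. */
typedef struct {
    const char *id;         /* 5-character method ID */
    unsigned    dict_size;  /* sliding dictionary size in bytes, 0 = none */
    unsigned    max_match;  /* maximum match length in bytes, 0 = n/a */
} lzh_method_info;

static const lzh_method_info lzh_methods[] = {
    { "-lh0-",     0,   0 },  /* no compression */
    { "-lh1-",  4096,  60 },
    { "-lh2-",  8192, 256 },  /* obsoleted */
    { "-lh3-",  8192, 256 },  /* obsoleted */
    { "-lh4-",  4096, 256 },
    { "-lh5-",  8192, 256 },
    { "-lh6-", 32768, 256 },
    { "-lh7-", 65536, 256 },
    { "-lzs-",  2048,  17 },
    { "-lz4-",     0,   0 },  /* no compression */
    { "-lz5-",  4096,  17 },
    { "-lhd-",     0,   0 },  /* directory, no compressed data */
};

/*
 * id points at the 5 method ID bytes from the header (offset 2); it does
 * not need to be NUL-terminated. Returns NULL for an unknown method.
 */
const lzh_method_info *lzh_find_method(const char *id)
{
    size_t i;

    for (i = 0; i < sizeof lzh_methods / sizeof lzh_methods[0]; i++)
        if (memcmp(id, lzh_methods[i].id, 5) == 0)
            return &lzh_methods[i];
    return NULL;
}
```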
Generic time stamp:
 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16
|<------ year ------>|<- month ->|<---- day --->|
 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
|<--- hour --->|<---- minute --->|<- second/2 ->|
Offset (from bit 31)   Length   Contents
0                      7 bits   year       years since 1980
7                      4 bits   month      [1..12]
11                     5 bits   day        [1..31]
16                     5 bits   hour       [0..23]
21                     6 bits   minute     [0..59]
27                     5 bits   second/2   [0..29]
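The generic time stamp can be unpacked with shifts and masks. The sketch below assumes the usual MS-DOS date/time packing that the bit diagram depicts: bits 31-25 year (7 bits), 24-21 month (4 bits), 20-16 day (5 bits), 15-11 hour, 10-5 minute, 4-0 second/2. The names `lzh_datetime` and `lzh_decode_generic_time` are hypothetical, and no range validation or time zone handling is attempted.

```c
#include <stdint.h>

/* Hypothetical broken-down form of the generic time stamp. */
typedef struct {
    int year;    /* full year, e.g. 2000 */
    int month;   /* 1..12 */
    int day;     /* 1..31 */
    int hour;    /* 0..23 */
    int minute;  /* 0..59 */
    int second;  /* 0..58, always even (stored as second/2) */
} lzh_datetime;

/* t is the 4-byte field at offset 15, already read as little-endian. */
lzh_datetime lzh_decode_generic_time(uint32_t t)
{
    lzh_datetime d;

    d.year   = 1980 + (int)((t >> 25) & 0x7F);
    d.month  = (int)((t >> 21) & 0x0F);
    d.day    = (int)((t >> 16) & 0x1F);
    d.hour   = (int)((t >> 11) & 0x1F);
    d.minute = (int)((t >> 5) & 0x3F);
    d.second = (int)(t & 0x1F) * 2;
    return d;
}
```

Because seconds are stored divided by two, odd seconds cannot be represented and the resolution of this time stamp is two seconds.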
NOTE: If you don't have `gettimeofday(2)', or your gettimeofday(2) returns bogus timezone information, try FTIME, MKTIME, TIMELOCAL or TZSET.
OS ID
'M'   MS-DOS
'2'   OS/2
'9'   OS9
'K'   OS/68K
'3'   OS/386
'H'   HUMAN
'U'   UNIX
'C'   CP/M
'F'   FLEX
'm'   Mac
'R'   Runser
Calc CRC16
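#include <limits.h>    /* CHAR_BIT, UCHAR_MAX */
typedef unsigned char  BYTE;
typedef unsigned short WORD;
#define CRCPOLY 0xA001 /* CRC-16 */
static WORD crctable[UCHAR_MAX + 1];
/* The next two globals are used by calccrc(); they are declared here so the
   fragment compiles, and the type of reading_size is an assumption. */
static WORD crc;                   /* running CRC; set to 0 before each file */
static unsigned long reading_size; /* total number of bytes fed to calccrc() */
/* Fill crctable[]; call once before using either CRC function. */
void make_crctable(void)
{
    unsigned int i, j, r;
    for (i = 0; i <= UCHAR_MAX; i++)
    {
        r = i;
        for (j = 0; j < CHAR_BIT; j++)
            if (r & 1)
                r = (r >> 1) ^ CRCPOLY;
            else
                r >>= 1;
        crctable[i] = r;
    }
}
/* CRC-16 of a header block of n bytes, starting from 0. */
WORD calc_header_crc(BYTE *p, WORD n)
{
    WORD crc = 0;
    while (n-- > 0)
        crc = crctable[(crc ^ (*p++)) & 0xFF] ^ (crc >> CHAR_BIT);
    return crc;
}
/* Update the running file CRC with n more bytes and return it. */
WORD calccrc(BYTE *p, WORD n)
{
    reading_size += n;
    while (n-- > 0)
        crc = crctable[(crc ^ (*p++)) & 0xFF] ^ (crc >> CHAR_BIT);
    return crc;
}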
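A quick way to sanity-check a CRC routine like the one above is to run it over a known input. The variant below is a self-contained sketch of the same table-driven update (polynomial 0xA001, initial value 0, no final XOR), rewritten without global state so the running CRC is passed in and returned. The names `crc16_init` and `crc16_update` are hypothetical. These parameters match the algorithm commonly catalogued as CRC-16/ARC, whose published check value for the ASCII string "123456789" is 0xBB3D.

```c
#include <stddef.h>
#include <stdint.h>

static uint16_t crc16_table[256];

/* Build the lookup table for the reflected polynomial 0xA001. */
void crc16_init(void)
{
    unsigned i;
    int j;

    for (i = 0; i < 256; i++) {
        unsigned r = i;
        for (j = 0; j < 8; j++)
            r = (r & 1) ? (r >> 1) ^ 0xA001u : r >> 1;
        crc16_table[i] = (uint16_t)r;
    }
}

/*
 * Feed len bytes into a running CRC and return the updated value.
 * Start with crc = 0; data may be supplied in several chunks.
 * crc16_init() must have been called first.
 */
uint16_t crc16_update(uint16_t crc, const void *data, size_t len)
{
    const uint8_t *p = (const uint8_t *)data;

    while (len-- > 0)
        crc = (uint16_t)(crc16_table[(crc ^ *p++) & 0xFF] ^ (crc >> 8));
    return crc;
}
```

Passing the CRC as a parameter makes it possible to checksum a header and a file independently without resetting a shared variable between them.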
Original: http://www.goice.co.jp/member/mo/formats/lzh.html
LHA: http://gnuwin32.sourceforge.net/packages/lha.htm