This is a look into how certain EXEPACK-related programs
handle the min_extra_paragraphs field in the
EXE header.
This field is also known as e_minalloc
in IMAGE_DOS_HEADER terms.
I found a copy of EXEPACK.EXE at the PCjs page for Microsoft Macro Assembler 4.00. There are various other versions available. Select the dropdown for the B: drive, ensure MS Macro Assembler 4.00 is selected, click Load, then Save.
WinWorld is another source for disk images. The Internet Archive has a "Microsoft MASM 4 beta", whose differences from 4.00 I did not examine thoroughly.
Use Mtools to examine and extract the disk image:
$ mdir -i MASM-016014-400.img
Volume in drive : has no label
Directory for ::/
MASM EXE 85566 1985-10-16 4:00
LINK EXE 43988 1985-10-16 4:00
SYMDEB EXE 37021 1985-10-16 4:00
MAPSYM EXE 18026 1985-10-16 4:00
CREF EXE 15028 1985-10-16 4:00
LIB EXE 28716 1985-10-16 4:00
MAKE EXE 24300 1985-10-16 4:00
EXEPACK EXE 10848 1985-10-16 4:00
EXEMOD EXE 11034 1985-10-16 4:00
COUNT ASM 5965 1985-10-16 4:00
README DOC 7630 1985-10-16 4:00
11 files 288 122 bytes
69 632 bytes free
$ mkdir MASM-016014-400
$ cd MASM-016014-400
$ mcopy -i ../MASM-016014-400.img -s ::/ ./
Inside DOSBox or similar, you can run the programs and see the version numbers.
C:\>MASM.EXE
Microsoft (R) Macro Assembler Version 4.00
Copyright (C) Microsoft Corp 1981, 1983, 1984, 1985. All rights reserved.
C:\>LINK.EXE
Microsoft (R) 8086 Object Linker Version 3.05
Copyright (C) Microsoft Corp 1983, 1984, 1985. All rights reserved.
C:\>EXEPACK.EXE
Microsoft (R) EXE File Compression Utility Version 4.00
Copyright (C) Microsoft Corp 1985. All rights reserved.
The disk image conveniently comes with a sample program, COUNT.ASM. Let's EXEPACK-compress it two ways, using EXEPACK.EXE and the /EXEPACK option to LINK.EXE.
C:\>MASM.EXE COUNT.ASM,COUNT.OBJ;
C:\>LINK.EXE COUNT.OBJ,COUNT.EXE;
C:\>EXEPACK.EXE COUNT.EXE COUNTE.EXE
C:\>LINK.EXE /EXEPACK COUNT.OBJ,COUNTL.EXE;
The two compressed files are not identical. Using Rabin2 and Radiff2, we see that there are only trivial differences: the Checksum field at offset 0x12 of the header (0x1399 in COUNTE.EXE, 0x0000 in COUNTL.EXE) and a single byte at offset 0x301 (0x00 in COUNTE.EXE, 0xc3 in COUNTL.EXE; 0xc3 is the opcode of a ret instruction).
$ du -b COUNT*.EXE
3081 COUNT.EXE
1092 COUNTE.EXE
1092 COUNTL.EXE
$ sha256sum COUNT*.EXE
10e86814a369a9cf12e7d0ea6930fdf3184692e4cdddae7627aea9ba0add4624 COUNT.EXE
ab629d01a7e99e20153b6dd85c87f5adb9fa211c4daa2a6cc67cc12772973ba1 COUNTE.EXE
548bc5075fc8e98acf2f53e903619ca7e9595a543618fb9a75b45621743bf1b5 COUNTL.EXE
$ rabin2 -H COUNT.EXE
[0000:0000] Signature MZ
[0000:0002] BytesInLastBlock 0x0009
[0000:0004] BlocksInFile 0x0007
[0000:0006] NumRelocs 0x0001
[0000:0008] HeaderParagraphs 0x0020
[0000:000a] MinExtraParagraphs 0x0000
[0000:000c] MaxExtraParagraphs 0xffff
[0000:000e] InitialSs 0x0000
[0000:0010] InitialSp 0x0100
[0000:0012] Checksum 0xfdf4
[0000:0014] InitialIp 0x000c
[0000:0016] InitialCs 0x0094
[0000:0018] RelocTableOffset 0x001e
[0000:001a] OverlayNumber 0x0000
$ rabin2 -H COUNTE.EXE
[0000:0000] Signature MZ
[0000:0002] BytesInLastBlock 0x0044
[0000:0004] BlocksInFile 0x0003
[0000:0006] NumRelocs 0x0000
[0000:0008] HeaderParagraphs 0x0020
[0000:000a] MinExtraParagraphs 0x0098
[0000:000c] MaxExtraParagraphs 0xffff
[0000:000e] InitialSs 0x00b5
[0000:0010] InitialSp 0x0080
[0000:0012] Checksum 0x1399
[0000:0014] InitialIp 0x0010
[0000:0016] InitialCs 0x0011
[0000:0018] RelocTableOffset 0x001e
[0000:001a] OverlayNumber 0x0000
$ rabin2 -H COUNTL.EXE
[0000:0000] Signature MZ
[0000:0002] BytesInLastBlock 0x0044
[0000:0004] BlocksInFile 0x0003
[0000:0006] NumRelocs 0x0000
[0000:0008] HeaderParagraphs 0x0020
[0000:000a] MinExtraParagraphs 0x0098
[0000:000c] MaxExtraParagraphs 0xffff
[0000:000e] InitialSs 0x00b5
[0000:0010] InitialSp 0x0080
[0000:0012] Checksum 0x0000
[0000:0014] InitialIp 0x0010
[0000:0016] InitialCs 0x0011
[0000:0018] RelocTableOffset 0x001e
[0000:001a] OverlayNumber 0x0000
$ radiff2 COUNTE.EXE COUNTL.EXE
0x00000012 9913 => 0000 0x00000012
0x00000301 00 => c3 0x00000301
Either way, compression has changed the value of the
min_extra_paragraphs field
(which Rabin2 calls MinExtraParagraphs)
from 0x0000 to 0x0098 (152 decimal).
Where does this come from?
The formula for the size of the program text, in bytes, is
512×(blocks_in_file − 1) + bytes_in_last_block − 16×header_paragraphs
The formula for the size of the additional memory, in bytes, is
16×min_extra_paragraphs
Adding these two values together gives the total runtime size of the program.
| file | program size (bytes) | extra size (bytes) | total size (bytes) |
|---|---|---|---|
| COUNT.EXE | 2569 | 0 | 2569 |
| COUNTE.EXE | 580 | 2432 | 3012 |
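The table can be reproduced with a few lines of Python. This is only a sketch: the function name `exe_sizes` is made up for illustration, the header bytes are rebuilt by hand from the Rabin2 output above rather than read from the files, and the special case where bytes_in_last_block is 0 (meaning a full 512-byte last block) is deliberately not handled, to stay faithful to the formula as written.

```python
import struct

def exe_sizes(header: bytes):
    """Return (program_size, extra_size, total_size) in bytes for an MZ header.

    Hypothetical helper for illustration. It applies the two formulas from
    the text and assumes bytes_in_last_block is nonzero.
    """
    # The first 14 bytes of the header: the "MZ" signature followed by six
    # little-endian 16-bit words.
    (signature, bytes_in_last_block, blocks_in_file, _num_relocs,
     header_paragraphs, min_extra_paragraphs, _max_extra_paragraphs) = \
        struct.unpack_from("<2s6H", header)
    if signature != b"MZ":
        raise ValueError("not an MZ executable")
    program = 512 * (blocks_in_file - 1) + bytes_in_last_block - 16 * header_paragraphs
    extra = 16 * min_extra_paragraphs
    return program, extra, program + extra

# Header fields copied from the rabin2 -H output for the two files.
count_header = struct.pack("<2s6H", b"MZ", 0x0009, 0x0007, 1, 0x0020, 0x0000, 0xFFFF)
counte_header = struct.pack("<2s6H", b"MZ", 0x0044, 0x0003, 0, 0x0020, 0x0098, 0xFFFF)

print(exe_sizes(count_header))   # (2569, 0, 2569)
print(exe_sizes(counte_header))  # (580, 2432, 3012)
```

To run it against a real file, pass the first 14 or more bytes of the file instead of the hand-built headers.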
The difference in program sizes accounts for 124 of the
152 paragraphs in the min_extra_paragraphs of COUNTE.EXE
(161 − 37 paragraphs, after rounding each program size up to a whole paragraph).
The remaining 28 paragraphs come from the size of the EXEPACK block itself (20 paragraphs)
and its 8-paragraph stack (initial_sp is 0x80, or 128 bytes).
In this case, min_extra_paragraphs had to increase
to account for the overhead of the EXEPACK block.
But if the original min_extra_paragraphs is large enough,
the EXEPACK block can make use of the same space,
and therefore the difference in min_extra_paragraphs
is simply the difference in program sizes.
The formula used by EXEPACK.EXE to compute the new min_extra_paragraphs is:
out.min_extra_paragraphs = in.program_paragraphs + max(in.min_extra_paragraphs, exepack_paragraphs+8) − out.program_paragraphs
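As a quick check, the formula can be written as a small Python function and applied to COUNT.EXE. The function name is made up for illustration, and the paragraph counts are my own derivation from the header dumps above, not values printed by any tool: 161 for the uncompressed program (2569 bytes rounded up), 37 for the compressed program (580 bytes rounded up), and 20 for the EXEPACK block (InitialSs 0x00b5 minus 161).

```python
def exepack_min_extra(in_program, in_min_extra, exepack_paragraphs, out_program):
    """EXEPACK.EXE-style formula for the new min_extra_paragraphs.

    All arguments are sizes in 16-byte paragraphs. Hypothetical helper:
    it restates the formula in the text and ignores 16-bit wraparound.
    """
    return in_program + max(in_min_extra, exepack_paragraphs + 8) - out_program

# COUNT.EXE -> COUNTE.EXE, using the derived paragraph counts.
print(hex(exepack_min_extra(161, 0, 20, 37)))  # 0x98

# With a large original min_extra_paragraphs, the increase is just the
# difference in program sizes (161 - 37 = 124).
print(exepack_min_extra(161, 100, 20, 37) - 100)  # 124
```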
The formula comes from reverse engineering part of the program:
fix_exe_header:
0be6 55 push bp
0be7 8bec mov bp, sp
0be9 b80a00 mov ax, 10
; Reserve space for local variables.
; bp-10 uint16_t exepack_paragraphs
; bp-8 uint16_t out_file_size_low
; bp-6 uint16_t out_file_size_high
0bec e8ed04 call stack_check
0bef 57 push di
0bf0 56 push si
0bf1 a1c42b mov ax, word [exepack_size]
0bf4 050f00 add ax, 15
0bf7 b104 mov cl, 4
0bf9 d3e8 shr ax, cl
0bfb 8946f6 mov word [exepack_paragraphs], ax ; exepack_paragraphs = (exepack_size+15)/16
0bfe b80200 mov ax, 2
0c01 50 push ax
0c02 2bc0 sub ax, ax
0c04 50 push ax
0c05 50 push ax
0c06 ff36bc2b push word [out_fd]
0c0a e8db06 call file_seek
0c0d 83c408 add sp, 8
0c10 8946f8 mov word [out_file_size_low], ax
0c13 8956fa mov word [out_file_size_high], dx ; out_file_size = file_seek(out_fd, 0, 0, SEEK_END)
0c16 80e401 and ah, 1
0c19 a3a802 mov word [out_bytes_in_last_block], ax ; out_bytes_in_last_block = out_file_size % 512
0c1c 8b46f8 mov ax, word [out_file_size_low]
0c1f 05ff01 add ax, 511
0c22 83d200 adc dx, 0
0c25 b109 mov cl, 9
0c27 e87005 call shr_long
0c2a a3aa02 mov word [out_blocks_in_file], ax ; out_blocks_in_file = (out_file_size+511)/512
0c2d a1c02b mov ax, word [in_exe_size_low]
0c30 8b16c22b mov dx, word [in_exe_size_high]
0c34 b104 mov cl, 4
0c36 e86105 call shr_long ; dx:ax = in_exe_size/16
0c39 8b4ef6 mov cx, word [exepack_paragraphs]
0c3c 03c8 add cx, ax
0c3e 890eb402 mov word [out_ss], cx ; out_ss = in_exe_size/16 + exepack_paragraphs
0c42 c706b6028000 mov word [out_sp], 0x80 ; out_sp = 0x80
0c48 a15007 mov ax, word [compressed_paragraphs]
0c4b 0106bc02 add word [out_cs], ax ; out_cs += compressed_paragraphs
0c4f a1c02b mov ax, word [in_exe_size_low]
0c52 8b16c22b mov dx, word [in_exe_size_high]
0c56 b104 mov cl, 4
0c58 e83f05 call shr_long ; dx:ax = in_exe_size/16
0c5b 8b4ef6 mov cx, word [exepack_paragraphs]
0c5e 83c108 add cx, 8 ; cx = exepack_paragraphs+8
0c61 8bf8 mov di, ax
0c63 3b0e5c07 cmp cx, word [in_min_extra_paragraphs] ; exepack_paragraphs+8 >= in_min_extra_paragraphs?
0c67 7305 jae l1
; in_min_extra_paragraphs is greater.
0c69 a15c07 mov ax, word [in_min_extra_paragraphs] ; ax = in_min_extra_paragraphs
0c6c eb06 jmp set_min_extra_paragraphs
l1:
; exepack_paragraphs+8 is greater.
0c6e 8b46f6 mov ax, word [exepack_paragraphs]
0c71 050800 add ax, 8 ; ax = exepack_paragraphs+8
set_min_extra_paragraphs:
0c74 2b46f6 sub ax, word [exepack_paragraphs] ; ax -= exepack_paragraphs
0c77 03c7 add ax, di ; ax += in_exe_size/16
0c79 2b065007 sub ax, word [compressed_paragraphs] ; ax -= compressed_paragraphs
; out_min_extra_paragraphs = in_exe_size/16 + max(in_min_extra_paragraphs, exepack_paragraphs+8) - (compressed_paragraphs + exepack_paragraphs)
0c7d a3b002 mov word [out_min_extra_paragraphs], ax
0c80 a15e07 mov ax, word [in_max_extra_paragraphs]
0c83 a3b202 mov word [out_max_extra_paragraphs], ax ; out_max_extra_paragraphs = in_max_extra_paragraphs
Microsoft EXEPACK.EXE will refuse to run
if the output file would be bigger than the input file.
My exepack program does support this, though,
so it uses a slightly more complicated formula
(which is equivalent in the case that out.program_paragraphs ≤ in.program_paragraphs):
out.min_extra_paragraphs = max(in.program_paragraphs + in.min_extra_paragraphs, in.program_paragraphs + exepack_paragraphs + 8, out.program_paragraphs + exepack_paragraphs + 8) − out.program_paragraphs
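The claimed equivalence is easy to check by brute force. The sketch below restates both formulas in Python (the function names are made up for illustration and are not taken from either program) and compares them over a small range of inputs where the output program is no larger than the input program.

```python
def simple_min_extra(in_program, in_min_extra, exepack_paragraphs, out_program):
    # The EXEPACK.EXE-style formula; valid only when out_program <= in_program.
    return in_program + max(in_min_extra, exepack_paragraphs + 8) - out_program

def general_min_extra(in_program, in_min_extra, exepack_paragraphs, out_program):
    # The three-way max form, which also covers out_program > in_program.
    return max(in_program + in_min_extra,
               in_program + exepack_paragraphs + 8,
               out_program + exepack_paragraphs + 8) - out_program

# Whenever compression does not grow the program, the two forms agree.
for in_program in range(0, 40):
    for out_program in range(0, in_program + 1):
        for in_min_extra in range(0, 40, 3):
            for exepack_paragraphs in (0, 5, 20):
                args = (in_program, in_min_extra, exepack_paragraphs, out_program)
                assert simple_min_extra(*args) == general_min_extra(*args)

# When the output is larger, the simple form can go negative, while the
# general form still reserves room for the EXEPACK block and its stack.
print(simple_min_extra(10, 0, 20, 40))   # -2
print(general_min_extra(10, 0, 20, 40))  # 28
```

The third term of the max is what keeps the result at least exepack_paragraphs + 8 when the compressed program is the larger of the two.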
When UNP decompresses a file,
it sets min_extra_paragraphs according to the formula
out.min_extra_paragraphs = max(0x1000, in.program_paragraphs + 512 + in.min_extra_paragraphs) − 512 − out.program_paragraphs
In the case of a largish program that has in.program_paragraphs + in.min_extra_paragraphs ≥ 0x1000 − 512,
the formula simplifies to
out.min_extra_paragraphs = in.program_paragraphs + in.min_extra_paragraphs − out.program_paragraphs
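Here is the same formula as a Python sketch, covering both the general and the simplified case. The function name is made up for illustration, and the paragraph counts used for COUNTE.EXE are my own derivation from the file sizes (36 for the 580-byte compressed image, rounded down; 161 for the decompressed image), so treat them as assumptions. The sketch restates only the formula above and leaves out any other adjustments UNP makes to the value.

```python
EXTRAMEM = 512    # extra paragraphs that UNP adds, then subtracts again
MIN_TOTAL = 0x1000  # UNP's minimum of 64 KiB, expressed in paragraphs

def unp_min_extra(in_program, in_min_extra, out_program):
    """UNP-style formula for min_extra_paragraphs after decompression.

    All arguments are sizes in 16-byte paragraphs. Hypothetical helper
    that restates the formula in the text.
    """
    total = max(MIN_TOTAL, in_program + EXTRAMEM + in_min_extra)
    return total - EXTRAMEM - out_program

# Small program: the 0x1000 floor dominates.
print(hex(unp_min_extra(36, 0x98, 161)))  # 0xd5f

# Largish program: the result is in + in_min_extra - out.
print(unp_min_extra(3000, 1000, 3500))  # 500
```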
This computation can be read from the file u4.asm in the
UNP source code.
The MoreStrucInfo label sets
TotalMem = max(0x1000, in_ExeImageSz/16 + EXTRAMEM + in_MinParMem):
MoreStrucInfo:
; ...
mov ds,SegEHInfo.A
ASSUME ds:NOTHING
mov ax,ExeImageSz
mov dx,ExeImageSz+2
div ParSize ;; ax = ExeImageSz / 16
xor dx,dx
add ax,EXTRAMEM ;; ax += 512
adc dl,0
add ax,ds:[MinParMem] ;; ax += MinParMem
adc dl,0
or dx,dx ; size above 1Mb ?
jne LoadError
cmp ax,01000h ; 64K?
jae UseMem
mov ax,01000h ;; ax = 0x1000
UseMem:
mov TotalMem,ax ;; TotalMem = ax
The CalcSize label then does
out_MinParMem = TotalMem − EXTRAMEM − (out_ExeImageSz+1)/16:
CalcSize:
mov ax,ProgFinalSeg ; calculate new image size
xor dx,dx
sub ax,SegProgram
sbb dx,0
mov cx,4
LongMul16:
shl ax,1
rcl dx,1
loop LongMul16 ;; dx:ax = (ProgFinalSeg - SegProgram) * 16
add ax,ProgFinalOfs
adc dx,0 ;; dx:ax += ProgFinalOfs
add ax,ExeSizeAdjust
adc dx,[ExeSizeAdjust+2] ;; dx:ax += 1 (not sure what this is for)
mov ExeImageSz,ax
mov ExeImageSz+2,dx
div ParSize
xchg ax,bx ;; bx = ExeImageSz/16
mov ax,TotalMem ;; ax = TotalMem
sub ax,EXTRAMEM ;; ax -= EXTRAMEM
sub ax,bx ;; ax -= ExeImageSz/16
cmp ax,0A000h
jb MinMemOk
xor ax,ax ; no minimal memory
MinMemOk:
cmp HeaderStored,0
jne _label01
mov es:[MinParMem],ax ;; MinParMem = ax
See it in action using the compressed COUNTE.EXE from the EXEPACK.EXE section:
C:\>UNP.EXE -v COUNTE.EXE COUNTEU.EXE
UNP 4.11 Executable file restore utility, written by Ben Castricum, 05/30/95
INFO - DOS Version 5.00
INFO - Commandline = "E -I -K+ -U -V COUNTE.EXE COUNTEU.EXE".
INFO - Using UNPTEMP$.$$$ as temp file.
INFO - Wildcard matches 1 filename(s), stored at 0000h.
INFO - Program loaded at 0192h, largest free memory block: 632123 bytes.
processing file : COUNTE.EXE
DOS file size : 1092
file-structure : executable (EXE)
EXE part sizes : header 512 bytes, image 580 bytes, overlay 0 bytes
INFO - File uses 0 fixups and requires atleast 3012 bytes to load.
INFO - Loading program at 1010h, blocksize 65536 bytes.
INFO - Required mem. 0098h, desired mem. FFFFh, header slack 484 bytes.
processed with : EXEPACK V4.00
action : decompressing... done
new file size : 2608
writing to file : COUNTEU.EXE
$ rabin2 -H COUNTE.EXE
[0000:0000] Signature MZ
[0000:0002] BytesInLastBlock 0x0044
[0000:0004] BlocksInFile 0x0003
[0000:0006] NumRelocs 0x0000
[0000:0008] HeaderParagraphs 0x0020
[0000:000a] MinExtraParagraphs 0x0098
[0000:000c] MaxExtraParagraphs 0xffff
[0000:000e] InitialSs 0x00b5
[0000:0010] InitialSp 0x0080
[0000:0012] Checksum 0x1399
[0000:0014] InitialIp 0x0010
[0000:0016] InitialCs 0x0011
[0000:0018] RelocTableOffset 0x001e
[0000:001a] OverlayNumber 0x0000
$ rabin2 -H COUNTEU.EXE
[0000:0000] Signature MZ
[0000:0002] BytesInLastBlock 0x0030
[0000:0004] BlocksInFile 0x0006
[0000:0006] NumRelocs 0x0001
[0000:0008] HeaderParagraphs 0x0002
[0000:000a] MinExtraParagraphs 0x0d5f
[0000:000c] MaxExtraParagraphs 0xffff
[0000:000e] InitialSs 0x0000
[0000:0010] InitialSp 0x0100
[0000:0012] Checksum 0x1399
[0000:0014] InitialIp 0x000c
[0000:0016] InitialCs 0x0094
[0000:0018] RelocTableOffset 0x001c
[0000:001a] OverlayNumber 0x0000
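TotalMem is set to
max(0x1000, 580/16 + 512 + 0x98) = max(0x1000, 36 + 512 + 152) = max(0x1000, 700) = 0x1000
The size of this program is below UNP's minimum memory threshold.
Then out_MinParMem becomes
0x1000 − 512 − 161 = 3423 = 0x0d5f
which matches the MinExtraParagraphs of COUNTEU.EXE (161 paragraphs being the size of the decompressed image).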
Because UNP does not round up its
in_ExeImageSz and out_ExeImageSz
to a multiple of 16 before dividing,
it may compute a value of out_MinParMem
that is 1 paragraph smaller than it should be.
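The off-by-one only shows up once the program is large enough to escape the 0x1000 minimum, so COUNTE.EXE itself does not exhibit it. The Python sketch below contrasts UNP-style truncating division with a variant that rounds both image sizes up to whole paragraphs. The rounded-up variant is my own reading of "what it should be", and the input values are hypothetical: COUNTE.EXE's image sizes combined with an artificially large min_extra_paragraphs of 0x1000.

```python
EXTRAMEM = 512

def ceil_div(n, d):
    # Integer division rounding up, for non-negative n and positive d.
    return -(-n // d)

def unp_style(in_bytes, in_min_extra, out_bytes):
    # Truncating division, with the +1 adjustment on the output size.
    total = max(0x1000, in_bytes // 16 + EXTRAMEM + in_min_extra)
    return total - EXTRAMEM - (out_bytes + 1) // 16

def rounded_up(in_bytes, in_min_extra, out_bytes):
    # Assumed "correct" variant: round both image sizes up to paragraphs.
    total = max(0x1000, ceil_div(in_bytes, 16) + EXTRAMEM + in_min_extra)
    return total - EXTRAMEM - ceil_div(out_bytes, 16)

# Hypothetical input: 580-byte compressed image, 2576-byte decompressed
# image, and in_min_extra large enough to exceed the 0x1000 minimum.
print(unp_style(580, 0x1000, 2576))   # 3971
print(rounded_up(580, 0x1000, 2576))  # 3972
```

The one-paragraph gap comes from 580/16 truncating to 36 where rounding up gives 37.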