This is a look into how certain EXEPACK-related programs
handle the min_extra_paragraphs field in the
EXE header.
This field is also known as e_minalloc
in IMAGE_DOS_HEADER terms.
I found a copy of EXEPACK.EXE at the
PCjs page for Microsoft Macro Assembler 4.00.
There are various other versions available.
To save the disk image from that page,
select the dropdown for the B: drive,
ensure MS Macro Assembler 4.00 is selected,
click Load,
then Save.
WinWorld
is another source for disk images.
The Internet Archive has a
"Microsoft MASM 4 beta",
whose differences from 4.00 I did not examine thoroughly.
Inside DOSBox or similar,
you can run the programs and see the version numbers.
The disk image conveniently comes with a sample program, COUNT.ASM.
Let's EXEPACK-compress it two ways:
using EXEPACK.EXE, which produces COUNTE.EXE,
and using the /EXEPACK option to LINK.EXE, which produces COUNTL.EXE.
The two compressed files are not identical.
Using Rabin2
and Radiff2,
we see that there are only trivial differences:
COUNTE.EXE has a checksum of 0x1399; COUNTL.EXE has a checksum of 0x0000.
COUNTE.EXE is padded with 0x00 bytes; COUNTL.EXE is padded with 0xc3 (which is also the last byte of the file, a ret instruction).
Either way, compression has changed the value of the
min_extra_paragraphs field
(which Rabin2 calls MinExtraParagraphs)
from 0x0000 to 0x0098 (152 decimal).
Where does this come from?
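The size of the program text is the file length recorded in the EXE header
(the count of 512-byte pages, adjusted by the number of bytes used in the last page)
minus the length of the header itself.
The size of the additional memory is 16 × min_extra_paragraphs,
because a paragraph is 16 bytes.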
Adding these two values together gives the total runtime size of the program.
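As a rough sketch, the two sizes can be computed from raw header fields as follows. The field names e_cblp, e_cp, e_cparhdr, and e_minalloc follow IMAGE_DOS_HEADER; the function name exe_sizes and the sample header values are made up for illustration and were not read from any of the files discussed here.

```python
def exe_sizes(e_cblp, e_cp, e_cparhdr, e_minalloc):
    """Return (program_size, extra_size, total_size) in bytes."""
    # e_cp counts 512-byte pages, including a partial last page.
    # e_cblp is the number of bytes used in the last page (0 means it is full).
    file_len = e_cp * 512
    if e_cblp != 0:
        file_len -= 512 - e_cblp
    # The header length is given in 16-byte paragraphs.
    program_size = file_len - e_cparhdr * 16
    # min_extra_paragraphs (e_minalloc) is also counted in paragraphs.
    extra_size = e_minalloc * 16
    return program_size, extra_size, program_size + extra_size

# Hypothetical header: 612-byte file, 32-byte header, e_minalloc = 0x98.
print(exe_sizes(e_cblp=100, e_cp=2, e_cparhdr=2, e_minalloc=0x98))
```

With these invented inputs the program size is 612 − 32 = 580 bytes and the extra size is 0x98 × 16 = 2432 bytes.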
file          program size (bytes)   extra size (bytes)   total size (bytes)
COUNT.EXE                     2569                    0                 2569
COUNTE.EXE                     580                 2432                 3012
The difference in program sizes accounts for 124 of the
152 paragraphs in the min_extra_paragraphs of COUNTE.EXE.
The remaining 28 paragraphs come from the size of the EXEPACK block itself
and its 8-paragraph stack (see initial_sp).
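The paragraph counts above can be checked with a few lines of Python. This is only an arithmetic check, assuming that program sizes are rounded up to whole 16-byte paragraphs; the helper name paragraphs is made up for illustration.

```python
def paragraphs(nbytes):
    # Round a byte count up to whole 16-byte paragraphs.
    return (nbytes + 15) // 16

in_paras = paragraphs(2569)   # COUNT.EXE program size in bytes
out_paras = paragraphs(580)   # COUNTE.EXE program size in bytes
saved = in_paras - out_paras  # paragraphs freed by compression
overhead = 0x98 - saved       # what is left of min_extra_paragraphs

print(in_paras, out_paras, saved, overhead)  # 161 37 124 28
```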
In this case, min_extra_paragraphs had to increase
to account for the overhead of the EXEPACK block.
But if the original min_extra_paragraphs is large enough,
the EXEPACK block can make use of the same space,
and therefore the difference in min_extra_paragraphs
is simply the difference in program sizes.
The formula used by EXEPACK.EXE to compute the new min_extra_paragraphs is:
The formula comes from reverse engineering part of the program:
Microsoft EXEPACK.EXE will refuse to run
if the output file would be bigger than the input file.
My exepack program does support producing an output larger than the input, though,
so it uses a slightly more complicated formula
(which is equivalent in the case that out.program_paragraphs ≤ in.program_paragraphs):
UNP 4.11
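When UNP decompresses a file,
it sets min_extra_paragraphs according to the formula
out.min_extra_paragraphs = max(0x1000, in.program_paragraphs + 512 + in.min_extra_paragraphs) − 512 − out.program_paragraphs
In the case of a largish program that has in.program_paragraphs + in.min_extra_paragraphs ≥ 0x1000 − 512,
the formula simplifies to
out.min_extra_paragraphs = in.program_paragraphs + in.min_extra_paragraphs − out.program_paragraphs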
This computation can be read from the file u4.asm in the
UNP source code.
The MoreStrucInfo label sets
TotalMem = max(0x1000, in_ExeImageSz/16 + EXTRAMEM + in_MinParMem):
The CalcSize label then does
out_MinParMem = TotalMem − EXTRAMEM − (out_ExeImageSz+1)/16:
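The two steps can be paraphrased in Python as a minimal sketch. It is not a transcription of u4.asm: the function names are made up, EXTRAMEM = 512 is an assumption inferred from the 0x1000 − 512 threshold mentioned earlier, and the divisions are assumed to truncate. The sample sizes are invented.

```python
EXTRAMEM = 512  # assumed value, inferred from the "0x1000 - 512" threshold

def unp_total_mem(in_exe_image_sz, in_min_par_mem):
    # MoreStrucInfo step: never go below 0x1000 paragraphs.
    return max(0x1000, in_exe_image_sz // 16 + EXTRAMEM + in_min_par_mem)

def unp_out_min_par_mem(total_mem, out_exe_image_sz):
    # CalcSize step: whatever is left after EXTRAMEM and the output image.
    return total_mem - EXTRAMEM - (out_exe_image_sz + 1) // 16

# A small (invented) input stays at the 0x1000 floor.
print(hex(unp_total_mem(1000, 10)))       # 0x1000

# A larger (invented) input exceeds the floor.
total = unp_total_mem(70001, 5000)
print(total)                              # 9887
print(unp_out_min_par_mem(total, 80000))  # 4375
```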
See it in action using the compressed COUNTE.EXE
from the EXEPACK.EXE section:
TotalMem is set to
The size of this program is below UNP's minimum memory threshold.
Then out_MinParMem becomes
Because UNP does not round up its
in_ExeImageSz and out_ExeImageSz
to a multiple of 16 before dividing,
it may compute a value of out_MinParMem
that is 1 paragraph smaller than it should be.
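To illustrate the off-by-one, the sketch below compares a truncating computation in the style of UNP with a variant that rounds both image sizes up to whole paragraphs first. The function names, the value EXTRAMEM = 512, and the sample sizes are all assumptions chosen to trigger the discrepancy; they do not come from a real file.

```python
EXTRAMEM = 512  # assumed value, inferred from the "0x1000 - 512" threshold

def ceil16(nbytes):
    # Number of 16-byte paragraphs needed to hold nbytes.
    return (nbytes + 15) // 16

def unp_style(in_size, in_min, out_size):
    # Truncating divisions, as in the two formulas described above.
    total = max(0x1000, in_size // 16 + EXTRAMEM + in_min)
    return total - EXTRAMEM - (out_size + 1) // 16

def rounded_up(in_size, in_min, out_size):
    # Same computation, but with sizes rounded up to whole paragraphs.
    total = max(0x1000, ceil16(in_size) + EXTRAMEM + in_min)
    return total - EXTRAMEM - ceil16(out_size)

# Input size is not a multiple of 16; output size is.
print(unp_style(70001, 5000, 80000))   # 4375
print(rounded_up(70001, 5000, 80000))  # 4376
```

With other remainders the two results can coincide, so the discrepancy only shows up for some combinations of input and output sizes.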