
CRINKLER - Compressing linker for Windows specialized for 4k intros

Aske Simon Christensen "Blueberry/Loonies"
Rune L. H. Stubbe "Mentor/TBC"

Version 1.0a (January 7, 2007)



VERSION HISTORY
---------------

07.01.07: 1.0a: New /VERBOSE:FUNCTIONS options to sort the functions.
                Various verbose output fixes.
                Various crash fixes.
                A fix to the /FIX Crinkler version recognizer.

27.12.06: 1.0:  Output EXE files are now Windows Vista compatible.
                Compression tweak for greatly improved compression ratio.
                Much faster compression.
                Automatically takes advantage of multiple processors.
                Improved Visual Studio 2005 integration.
                /COMPMODE:INSTANT option for very quick compression.
                /ORDERTRIES option to try out different section orderings.
                /SAFEIMPORT option to insert a check for nonexisting DLLs.
                /PROGRESSGUI option for a graphical progress bar.
                /REPLACEDLL option to replace one DLL with another.
                /FIX option to fix compatibility problems of older versions.

09.02.06: 0.4a: Fixed linker crash problem with blank member entries
                in some library files (such as glut32).
                The /PRIORITY option was not mentioned in the
                commandline usage help.

18.12.05: 0.4:  Changed header and import code to make output EXE files
                compatible with 64-bit versions of Windows.
                Fixed a bug in the ordinal range import mechanism.
                Added a switch to control the process priority.
                Added a warning for range import of an unused DLL.
                Some more header squeezing.

31.10.05: 0.3:  Output EXE files are now Windows 2000 compatible.
                Added a number of verbose options to output useful
                information about the program being compressed.
                Added an option for transforming function calls to
                use absolute offsets to improve compression.
                Fixed a bug in the linker regarding identically named
                sections.
                Fixed a potential crash bug in the linker.
                Various small tweaks and optimizations.

23.07.05: 0.2:  Fixed bug in the decompressor.
                Changed the behaviour of the /CRINKLER option.
                Added timing to the progress bars.
                Some updates to the manual and usage description.

21.07.05: 0.1:  First release.



BACKGROUND
----------

Ever since the concept of size-limited demo competitions was
introduced in the early 1990's (and before that as well), people have
been using executable file compressors to reduce the size of their
final executables. An executable file compressor is a program that
takes as input an executable file and produces a new executable file
which has the same behaviour as the original one but is (hopefully)
smaller.

The usual technique employed by executable file compressors is to
compress the contents of the executable file using some general
purpose data compression method and prepend to this compressed data a
small piece of code (the decompressor) which decompresses the contents
into memory in such a way that it looks to the code as if the original
executable file had been loaded into memory in the normal way.

The size of the decompressor is usually around a few hundred bytes,
depending on the complexity of the compression method. This
constitutes an unavoidable overhead in the compressed file, which is
particularly evident for small files, such as 4k intros. Furthermore,
the header of the Windows EXE file format contains a lot of
information that needs to be there at fixed offsets in order for
Windows to be able to load the file. The presence of these overheads
from the header and decompressor has motivated people to look for
other means of compressing their 4k intros.

Until Crinkler came around, the most popular strategy for compressing
4k intros for Windows was CAB dropping: A few simple transformations
are performed on the executable to make it compress better (such as
merging sections and setting unused header fields to zero), and the
result is compressed using the Cabinet Compression tool included with
Windows. The resulting .CAB file is renamed to have .BAT extension,
and some commands are inserted into the file such that when the .BAT
file is executed, it decompresses the executable to disk (using the
Cabinet decompression command), runs the executable and then deletes
the executable again. This saves the size of the decompression code
(since an external program is used to do the decompression) and some
of the size of the header (since the header can be compressed).

Various dropping strategies combined with other space-saving hacks
people have employed on their 4k intros (in particular import by
ordinal) have caused severe compatibility problems. More often than
not, people who want to run a newly released 4k intro find that it
does not work on their own machine. In recent years, it has been
customary to include a 'compatible' version in the distribution which
is larger than 4k but works on all machines. The term '4k intro' seems
to mean '4k on the compo machine' intro.

The main motivation for starting the Crinkler project has been the
feeling that the existing means available for compressing 4k intros
are unsatisfactory. We want 4k intros that are self-contained EXE
files. We want 4k intros that are 4 kilobytes in size. Our aim for
Crinkler is to be the cleanest, most effective and most compatible
executable compressor for Windows 4k intros.



INTRODUCTION
------------

Crinkler is a different approach to executable file compression. While
an ordinary executable file compressor operates on the executable file
produced by the linker from object files, Crinkler replaces the linker
by a combined linker and compressor. The result is an EXE file which
does not do any kind of dropping. It decompresses into memory like a
traditional executable file compressor.

Crinkler employs a range of techniques to reduce the size of the
resulting EXE file beyond what is usually obtained by using CAB
compression:

- Having control over the linking step gives much more flexibility in
  the optimizations and transformations possible on the data before
  and after compression.

- The compression technique used by Crinkler is based on context
  modelling, which is far superior in compression ratio to the LZ
  variants used by CAB and most other compressors. The disadvantage of
  context modelling is that it is extremely slow, but this is of
  little importance when only 4 kilobytes need to be compressed. It
  also needs quite a lot of memory for decompression, but this is
  again not a problem, since the typical 4k intro uses a lot of memory
  anyway.

- The actual compression algorithm performs many passes over the data
  in order to optimize the internal parameters of the compressor. This
  results in slower compression, but this is usually a reasonable
  price to pay for the extra bytes gained on the file size.

- The contents of the executable are split into two parts - a code
  part and a data part - and each of these are compressed
  individually. This leads to better compression, as code and data are
  usually very different in structure and so do not benefit from being
  compressed together.

- DLL functions are imported by hash code. This is robust to
  structural changes to the DLL between different versions while being
  quite compact - only 4 bytes per imported function. For DLLs with
  fixed relative ordinals (such as opengl32), a special technique,
  ordinal range import, can be used to further reduce the number of
  hash codes needed.

- Much of the data in the EXE header is actually ignored by the EXE
  loader. Some of this space is used for some of the decompression
  code, and the rest is used to store hash codes for imported
  functions.

Using Crinkler is somewhat different from using an ordinary executable
file compressor because of the linking step. In the following
sections, we describe its use in detail.



INSTALLATION
------------

To use as a stand-alone linker, Crinkler does not need any
installation. Simply run crinkler.exe from the commandline with
appropriate arguments, as described in the next section.

However, if you are using Microsoft Visual Studio to develop your
intro, the easiest way to use Crinkler is to run it in place of the
normal Visual Studio linker. Crinkler has been designed as a drop-in
replacement of the Visual Studio linker, supporting the same basic
options. All of the options can then be set using the Visual Studio
configuration window.

Unfortunately, Visual Studio does not (as of this writing) support
replacing its linker by a different one. So what you have to do is the
following:

- Copy crinkler.exe to your project directory and rename it to
  link.exe. Visual Studio will then find it when it tries to invoke
  the linker. If you are using some other linker with a different
  name, such as the one used with the Intel C++ compiler, call it
  whatever the name of the linker is.

- If you are using Visual Studio 2005, select Tools/Options... and go
  to Projects and Solutions/VC++ Directories. At the top of the list
  for Executable files, add $(SolutionDir). This will make sure that
  the project directory is searched for the linker executable.

- In the Release configuration (or whichever configuration you want to
  enable compression), under Linker/Command Line/Additional Options,
  type in /CRINKLER, along with any other Crinkler options you want to
  set. See the next section for more details on options. If you are
  using Visual Studio 2005, set Linker/Manifest File/Generate Manifest
  to No.

If you have Visual Studio installed but want to run Crinkler from the
commandline, the easiest way is to use the Visual Studio Command
Prompt (available from the Start menu), since this sets up the LIB
environment variable correctly. You can read off the value of the
environment variables by running the 'set' command in this command
prompt. If you are using a different command prompt, you will have to
set up the LIB environment variable manually, or use the /LIBPATH
option.



USAGE
-----

The general form of the command line for Crinkler is:

CRINKLER [options] [object files] [library files] [@commandfile]

When running from within Visual Studio, the object files will be the
ones generated from the sources in the project. The library files will
be the standard set of Win32 libraries, plus any additional library
files specified under Linker/Input/Additional Dependencies. If you are
using a standard runtime library, such as libc or msvcrt, you will
have to specify this one manually.


The following options are compatible with the VS linker and can be set
using switches in the Visual Studio configuration window:

/SUBSYSTEM:CONSOLE
/SUBSYSTEM:WINDOWS
(Linker/System/SubSystem)

    Specify the Windows subsystem to use. If the subsystem is CONSOLE,
    a console window will be opened when the program starts. The
    subsystem also determines the name of the default entry point (see
    /ENTRY). The default subsystem is WINDOWS.

/OUT:[file]
(Linker/General/Output File)

    Specify the name of the resulting executable file. The default
    name is out.exe.

/ENTRY:[symbol]
(Linker/Advanced/Entry Point)

    Specify the entry label in the code. The default entry label is
    mainCRTStartup for CONSOLE subsystem applications and
    WinMainCRTStartup for WINDOWS subsystem applications.

/LIBPATH:[path]
(Linker/General/Additional Library Directories)

    Add a number of directories (separated by semicolons) to the ones
    searched for library files. If a library is not found in any of
    these, the directories mentioned in the LIB environment variable
    are searched.

@commandfile

    Commandline arguments will be read from the given file, as if they
    were given directly on the commandline.


In addition to the above options, a number of options can be given to
control the compression process. These can be specified under
Linker/Command Line/Additional Options:

/CRINKLER

    Enable the Crinkler compressor. If this option is disabled,
    Crinkler will search through the path for a command with the same
    name as itself, skipping itself, and pass all arguments on to this
    command instead. This will normally invoke the Visual Studio
    linker. If the name of the Crinkler executable is crinkler.exe,
    this option is enabled by default, otherwise it is disabled by
    default.

/FIX

    Fix a program compressed using an older version of Crinkler so
    that it works on all versions of Windows 2000, XP and Vista. When
    this option is specified, all other options except /OUT are
    ignored, and Crinkler takes a single file argument, which must be
    an output EXE file from Crinkler 0.3 or newer.

/PRIORITY:IDLE
/PRIORITY:BELOWNORMAL
/PRIORITY:NORMAL

    Select the process priority at which Crinkler will run while
    compressing. The default priority is BELOWNORMAL. Use IDLE if you
    want Crinkler to disturb you as little as possible. Use NORMAL if
    you don't need your machine for anything else while compressing.

/COMPMODE:INSTANT
/COMPMODE:FAST
/COMPMODE:SLOW

    Choose between three different compression modes. The FAST mode
    usually compresses in a couple of seconds, while the SLOW one can
    take up to a few minutes to complete. The slow one usually
    compresses about 10-40 bytes better on a 4k executable. Use
    INSTANT if you just want to check that your program works in
    compressed form and don't care about the size. The default
    compression mode is FAST.

/HASHSIZE:[memory size]

    Specify the amount of memory the decompressor is allowed to use
    while decompressing, in megabytes. In general, the more memory the
    decompressor is allowed to use, the better the compression ratio
    will be, though only slightly. The memory requirements of the
    final executable (the size of the executable image when loaded
    into memory) will be the maximum of this value and the original
    image size. The memory will not be deallocated until the program
    terminates, and any heap allocation the program performs will add
    to this memory usage. The default value is 100, which is usually a
    good compromise.

/HASHTRIES:[number of retries]

    Specify the number of different hash table sizes the compressor
    will try in order to find one with few collisions. More tries lead
    to longer compression time but slightly better compression. The
    default value is 20. Higher values rarely improve the size by more
    than a few bytes.

/ORDERTRIES:[number of retries]

    Specify the number of section reordering iterations that the
    linker will try out in search for the ordering that gives the best
    compression ratio. The default is not to do any reordering.
    Specifying this option drastically increases the compression time,
    since Crinkler has to calculate the compressed size anew on every
    reordering. Usually, the size does not improve noticeably after a
    few thousand iterations.

/RANGE:[DLL name]

    Import functions from the given DLL (without the .dll suffix)
    using ordinal range import. Ordinal range import imports the first
    used function by hash and the rest by ordinal relative to the
    first one. Ordinal range import is safe to use on DLLs in which
    the ordinals are fixed relative to each other, such as opengl32 or
    d3dx9_??. This option can be specified multiple times, for
    different DLLs.

/REPLACEDLL:[oldDLL=newDLL]

    Whenever a function is imported from oldDLL, import it from newDLL
    instead. DLL replacement is useful when the end user might not
    have the version of the DLL that you are linking to. Typical
    examples are replacing msvcr70, msvcr71 or msvcr80 by msvcrt and
    replacing one version of d3dx9_?? by another. Only use this option
    if you know that the two DLLs are compatible. When REPLACEDLL and
    RANGE are used together, RANGE must refer to the new DLL.

/SAFEIMPORT

    If the executable fails to load some DLL, it will pop up a message
    box with the DLL name rather than crash. This will increase the
    file size by around 20 bytes.

/TRANSFORM:CALLS

    Change the relative jump offsets in all internal call instructions
    (E8 opcode) into absolute offsets from the start of the code. This
    usually improves compression, since multiple calls to the same
    function become identical. The transformation has an overhead of
    about 20 bytes for the detransformation code, but the net savings
    on a full 4k can be as large as 50 bytes, depending on the number
    of calls in your code.


Finally, Crinkler has a number of options for controlling the output
during compression. Just like the other options, these can be
specified under Linker/Command Line/Additional Options:

/VERBOSE:FUNCTIONS
/VERBOSE:FUNCTIONS-BYNAME
/VERBOSE:FUNCTIONS-BYSIZE

    For each function in the program, output the size of the function
    in bytes before and after compression. This can help you decide
    whether specific optimizations you try out on your code are
    worthwhile. Note that the compressed size of a function depends
    heavily on what comes before it. If you have two almost identical
    functions, the second one will be much smaller, since the contents
    of the first one will guide the compression of the second.

    By default, the functions are written in the order in which they
    occur in the linked code. If you specify -BYNAME or -BYSIZE, they
    will be sorted in alphabetical order or by decreasing compressed
    size, respectively.

    For this option to work, you must enable debugging information in
    your object files. In Visual Studio, this is done by selecting a
    debugging information format under C/C++/General/Debug Information
    Format. Any format other than Disbled will work.

/VERBOSE:LABELS

    As /VERBOSE:FUNCTIONS, but instead of showing the size of
    functions, the sizes between every label is shown. This will show
    the sizes of code as well as data and does not require any
    debugging information apart from the labels.

/VERBOSE:IMPORTS

    List all functions imported from DLLs. The functions are grouped
    by DLL, and functions imported by ordinal range import are grouped
    into ranges.

/VERBOSE:MODELS

    List the model masks and weights selected by the compressor. This
    is mostly for internal use.

/PROGRESSGUI

    Open a window showing a graphical progress indicator.


An example commandline for linking and compressing an intro could look
like this (split on multiple lines for readability):

crinkler.exe /OUT:micropolis.exe /SUBSYSTEM:WINDOWS /RANGE:opengl32
 /COMPMODE:SLOW /ORDERTRIES:1000 /VERBOSE:IMPORTS /VERBOSE:LABELS
 kernel32.lib user32.lib gdi32.lib opengl32.lib glu32.lib winmm.lib
 micropolis\startup.obj micropolis\render.obj
 micropolis\render-asm.obj micropolis\sound.obj
 micropolis\sound-asm.obj



RECOMMENDATIONS
---------------

There are a number of things you can do as intro programmer to boost
the compression achieved by Crinkler even further. This section
gives some advice on these.

- Do not use a standard runtime library. If you really need some
  standard library functions, use msvcrt.lib and use your own entry
  point using the /ENTRY option. This will skip the standard startup
  code, saving you around half a kilobyte.

- Since much of the effectiveness of Crinkler comes from separating
  code and data into different parts of the file and compressing each
  part individually, it is important that this separation is
  possible. Mark your code and data sections as containing code and
  data, respectively, and do not put both code and data into the same
  section. See your assembler manual for information about how to do
  this. For instance, in Nasm, you can write the keyword "text" or
  "data" after the section name and give sections different names to
  prevent them from being merged by the assembler.

- Split both your code and your data into as many sections as
  possible. This gives Crinkler more opportunities to select the
  ordering of the sections to optimize the compression ratio.

- If you are using OpenGL, use ordinal range import for opengl32. If
  you are using Direct3D, use ordinal range import for d3dx9_??. This
  drastically reduces the space needed for function hash codes.

- Avoid large blocks of data, even if they are all zero. Use
  uninitialized (bss) sections instead. Crinkler does not cope well
  with large amounts of data. Be aware that the compressor may use an
  amount of memory up to about 4000 times the uncompressed code/data
  size (whichever is largest).

- When you perform detailed size comparisons, always use the SLOW
  compression mode with plenty of HASHTRIES. The INSTANT and FAST
  modes are only intended for use during testing and to give a rough
  estimate of the compressed size. Also note that the compression is
  tuned for the 4k size target, so any size comparisons you perform on
  smaller files might turn out to behave differently when you get
  nearer to 4k.

- As a matter of good conduct, always use SAFEIMPORT if you can spare
  the space, and do not set HASHSIZE higher than you need. In other
  words, if your final intro is well below the size limit, enable
  SAFEIMPORT and then lower HASHSIZE in order not to waste memory
  unnecessarily.



COMMON PROBLEMS, KNOWN BUGS AND LIMITATIONS
-------------------------------------------

Any DLL that is needed by a program that Crinkler compresses must be
available to Crinkler itself. If you get the error message 'Could not
open DLL ...', it means that Crinkler needed the DLL but could not
find it. You must place it either in the same directory as the
Crinkler executable or somewhere in the DLL path, such as
C:\WINDOWS\system32. Alternatively, you can use the REPLACEDLL option
to replace it by one that is available.

If you launch your Crinkler-compressed program from within Visual
Studio, use Start Without Debugging (Ctrl+F5) rather than Start
Debugging (F5). The debugger cannot handle Crinkler-compressed
executables. If the program crashes, you can still attach the debugger
in the normal way.

When running inside Visual Studio, the textual progress bars are not
updated correctly, since the Visual Studio console does not flush the
output until a newline is reached, even when explicitly flushed by the
running program. Use the /PROGRESSGUI option to get a graphical
progress bar.

The code for parsing object and library files contains only a minimum
of sanity checks. If you pass a corrupt file to Crinkler, it will most
likely crash.

The import code does not support forwarded RVA imports, which means
that some functions, such as HeapAlloc, cannot be used. This makes
Crinkler unable to link with libc. What a loss.

The final compressed size must be less than 64k, or Crinkler will fail
horribly. You shouldn't use it for such big files anyway.

Crinkler does not support the whole program optimization feature of
Visual Studio (General/Whole Program Optimization). Do not turn it on.



SUPPORT
-------

Try out Crinkler, and let us know what you think about it. If you have
any problems, comments or suggestions, please write a message at the
Pouet.net forum:

http://www.pouet.net/prod.php?which=18158

If you want to contact us directly, e.g. for sending us a file, write
to authors@crinkler.net.

The newest released version of Crinkler can always be found at
http://www.crinkler.net.



ACKNOWLEDGEMENTS
----------------

The compression technique used by Crinkler is much inspired by the PAQ
compressor by Matt Mahoney.

The import code is loosely based on the hashed imports code by Peci.

Many thanks to all the people who have given us comments, bug reports
and test material, especially to Rambo, Kusma, Polaris, Gargaj,
Frenetic, Buzzie, Shash, Auld, Minas, Skarab, Dwing, Freak5, Hunta,
Snq, Darkblade, Abductee, Las and Hitchhikr. Also thanks to Dwarf,
Polygon7 and Gargaj for suggestions for our web design.

Our special thanks to the many people who have demostrated the
usefulness of Crinkler by using it for their own productions.

Keep it going! We greatly appreciate your feedback.
