Sources for Rose Shank, a 4k intro for Assembly 2007
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Intro by:
  Jere Sanisalo - http://www.xmunkki.org
  Jetro Lauha   - http://jet.ro

Source release and the IL4 compiler by:
  Jere Sanisalo - http://www.xmunkki.org


Introduction
-=-=-=-=-=-=
Rose Shank was an experimental 4k intro. The experiment was to see if it was
feasible to build a small virtual machine with a custom bytecode to run the
intro, and whether doing so would be easier or give smaller code. It was a
given that it would not be faster. :)

This language was named IL4 Lisp-ahtava (the compiler is called il4c). As the
name says, the language uses Lisp syntax. That's about all it has in common
with Lisp. The parenthesized Lisp syntax was chosen to allow fast turnaround
while developing the language further (adding new constructs was a lot easier
this way, as the lexer/parser stayed the same).


Overview
-=-=-=-=
Il4c reads and parses all the source files given to it at once. The files
are interpreted as if they were joined together, and all symbols are globally
visible. Il4c then performs a number of simple optimizations over the code.
These include stripping functions that are never called. Not much is done to
the actual code. This gives greater control to the IL4 programmer (in C, for
example, a small change may have a huge effect on inlining and on code size in
general). All inlining and such should be done by hand. Il4c then generates a
suitable bytecode for the remaining code. Only the operations that are needed
are assigned a bytecode. As a last step, il4c generates one assembly file,
"out.asm", which contains both the generated bytecode interpreter and the
actual bytecode. This asm file is designed to be assembled by nasm. The
Microsoft linker may work as the linker, but Crinkler is the only linker ever
tested with il4c.
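
The "strip functions that are never called" step can be pictured as a
reachability walk over the call graph, starting from "main". The Python sketch
below only illustrates that idea; it is not taken from il4c, and the call
graph and the name "unused_helper" are made up for the example.

```python
def reachable_functions(calls, entry="main"):
    """Return the set of function names reachable from `entry`.

    `calls` maps each function name to the names it calls directly.
    """
    seen = set()
    pending = [entry]
    while pending:
        name = pending.pop()
        if name in seen:
            continue
        seen.add(name)
        # Functions without an entry in `calls` (e.g. asmfuns) call nothing.
        pending.extend(calls.get(name, ()))
    return seen


# Hypothetical call graph; "unused_helper" is never reached from main.
calls = {
    "main": ["rand_rangef"],
    "rand_rangef": ["i2f", "rand_range"],
    "unused_helper": ["i2f"],
}
kept = reachable_functions(calls)
stripped = sorted(set(calls) - kept)
print(stripped)  # ['unused_helper']
```

Everything outside the returned set could be dropped before bytecodes are
assigned, which also fits the note that only the operations actually needed
get a bytecode.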

To compile Rose Shank, run compile.sh with Cygwin. You may need to modify it
to set the correct library paths (it uses the Win32 libraries from Visual
Studio).


IL4
-=-
A few words about the IL4 language (as there is virtually no documentation
about it) to get the interested reader started.

Start from "test.il4" (the startup function "main" is there) and "core.il4"
(most of the basic math functions and such are there).

Comments start with a # char and go up to the end of the line.

The language is totally untyped (well, strictly speaking the type is
"32-bits" :). That means you have to be careful to call the right arithmetic
and comparison functions (as well as others). The values may very well be
pointers, integers, floats or whatever.

On the top level an IL4 file may contain roughly the following elements:
 - Constants. When referenced, the reference is replaced with the constant
   value. These are just handy shorthands for giving names to numbers.

   Example:
    (const pi 3.1415926535897932384626433832795)
 - Global variables. These may have a constant value set to them or not.

   Example:
    (var room_floor_lights)
    (var cam_ang 0.0)
 - Bytecode functions. These take 0 or more arguments, then execute the code
   in their body.

   Example:
    # Random function returning a float with an integer interval [mini,maxi[.
    (fun rand_rangef (mini maxi)
         i2f (rand_range mini maxi))

   Notes on example:
       "i2f" is a function that takes one argument, and it changes an integer
       to a float. "rand_range" is another function that takes 2 arguments.
 - Assembly functions. These are special functions. A call to one compiles
   to a single bytecode, and the function's raw assembly is executed when that
   bytecode runs. The code is copied as-is to the relevant location in the
   final output. Note that the arguments are pushed onto the stack from left
   to right (for asmfuns); that is, pop eax would pop the last argument. It's
   assumed that every asmfun pops all of its arguments from the stack and
   places one value on the stack when it's done (every function always
   returns something).

   Example:
    (asmfun +i (a b)
            "pop eax"
            "add [esp], eax")
 - Compiler directives. Currently there is only one, "heapsize". This isn't
   used in Rose Shank, but it sets the amount of heap to allocate for the
   executable. The allocator in Rose Shank is a simple pointer incrementer,
   so make sure you have enough heap so the pointer doesn't grow over it.
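
The asmfun stack convention described above (arguments pushed left to right,
all arguments consumed, one result left behind) can be modelled in a few lines
of Python. This is only a sketch of the convention as this README describes
it; "call_asmfun" and "add_i" are hypothetical names, and the real thing is
x86 assembly run by the generated interpreter.

```python
MASK32 = 0xFFFFFFFF  # every IL4 value is 32 bits wide


def call_asmfun(stack, args, body):
    """Push `args` left to right, run `body`, and return the single result."""
    depth_before = len(stack)
    for value in args:
        stack.append(value & MASK32)  # the last argument ends up on top
    body(stack)
    # Convention check: all arguments consumed, exactly one value left.
    assert len(stack) == depth_before + 1
    return stack[-1]


def add_i(stack):
    """Model of the "+i" asmfun from the example above."""
    eax = stack.pop()                        # pop eax        -> b, the last argument
    stack[-1] = (stack[-1] + eax) & MASK32   # add [esp], eax -> a becomes a + b


stack = []
result = call_asmfun(stack, [2, 3], add_i)
print(result)  # 5
```

Note how "+i" never pushes anything: it pops b and overwrites a in place, which
has the same net effect as popping both arguments and pushing the sum.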

Function bodies mostly consist of:
 - Other function calls
 - "(if (expr) (then) (else))" where "expr", "then" and "else" are some code
   blocks.
 - Variable definitions (same as globals but inside functions).
 - Variable sets, "(set var-name value)"
 - While loops, "(while (expr) code)"
   This is the only looping mechanism in IL4.
 - External symbol pointer lookups, "(external_symbol "_wglCreateContext@4")"
   This compiles directly to a constant value, which is a pointer to the
   symbol. This helper form is used to call external C/asm functions.
 - Stdcall/cdecl calls. Some examples:
     (fun glVertex3fv (arr) (stdcall (external_symbol "_glVertex3fv@4") arr))
     (fun glViewport (x y w h) (stdcall (external_symbol "_glViewport@16") x y w h))
   There are different forms for functions which return a floating point value,
   namely "stdcall-fp" and "cdecl-fp".


Tips
-=-=
Modify the compile.sh script and change the parameters a little. A few of the
optimizations actually made the packed code bigger (even though the raw object
file was smaller). The "-save-debug" switch is also useful, as it saves the
intermediate forms of the program (the tree forms before/after optimization).
"il4c -help" gives the list of command line options.

Read & try to understand the generated "out.asm" file. Some quick notes about
it:
 - Every assembly function ends with a "jmp ebx"
 - Every opcode has a 16-bit lookup index to its code
 - Every bytecode function has a 16-bit lookup index to its code
 - Every bytecode function has 16 bits of function header and one return
   bytecode at the end (8 bits); total overhead 3 bytes
 - Constants currently take 4 bytes each.
 - Every global that has a preset value takes 4 bytes (even if the same value
   is used as a constant elsewhere).
 - Globals that have no preset values take no space.
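
The per-item costs listed above add up quickly in a 4k budget. The Python
sketch below simply sums them for a given set of counts. It is a rough
estimate under assumptions: the function and parameter names are hypothetical,
the example counts are invented, and this README does not say whether the
4 bytes per constant are paid per use or per distinct value, so "constants" is
whatever count il4c actually emits.

```python
def fixed_overhead_bytes(opcodes, bytecode_funs, constants, preset_globals,
                         plain_globals=0):
    """Sum the fixed per-item costs listed in the notes above (in bytes)."""
    lookup_index = 2             # 16-bit lookup index
    fun_header_and_return = 3    # 16-bit header + 8-bit return bytecode
    total = opcodes * lookup_index
    total += bytecode_funs * (lookup_index + fun_header_and_return)
    total += constants * 4
    total += preset_globals * 4
    total += plain_globals * 0   # globals without a preset value are free
    return total


# Invented counts, only to show the arithmetic:
# 40*2 + 25*(2+3) + 30*4 + 5*4 + 12*0 = 80 + 125 + 120 + 20 = 345
print(fixed_overhead_bytes(opcodes=40, bytecode_funs=25, constants=30,
                           preset_globals=5, plain_globals=12))  # 345
```

These are raw sizes before packing; as noted above, a smaller raw object file
did not always give a smaller packed executable.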

Doubles were a bit hacked in (as the general type is "32-bits"). The only
functions that required doubles in Rose Shank were OpenGL calls, which follow
the stdcall calling convention. What we do is use an asmfun which converts a
float to a double on the stack. Because the stdcall convention requires the
called function to clean up its own arguments, this hack works (the stack
holds the correct number of items after the call).
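
To make the float-to-double trick concrete: widening turns one 32-bit value
into two 32-bit stack slots, and a stdcall callee that takes a double removes
all 8 bytes itself. The Python sketch below only shows the bit-level widening
with the standard struct module; the function name is hypothetical and the
actual conversion in Rose Shank is done by an asmfun, not by code like this.

```python
import struct


def float_bits_to_double_slots(float_bits):
    """Widen a 32-bit float bit pattern into two 32-bit slots holding a double.

    Returns (low, high). x86 is little-endian, so the low dword of the
    double sits at the lower address.
    """
    (value,) = struct.unpack("<f", struct.pack("<I", float_bits))
    low, high = struct.unpack("<II", struct.pack("<d", value))
    return low, high


low, high = float_bits_to_double_slots(0x3F800000)  # 1.0 as a 32-bit float
print(hex(low), hex(high))  # 0x0 0x3ff00000
```

After the conversion the argument occupies two slots instead of one, so the
IL4 side sees one value going in while the callee pops two. That only stays
balanced because the callee does the cleanup.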

To compile the HD version, modify compile.sh to use "resolution_hd.il4"
instead of "resolution.il4".


License
-=-=-=-
These sources and the IL4 compiler are released under the GPLv2 license (see
LICENSE.txt).


Contact
-=-=-=-
Web: http://www.xmunkki.org/
