John's Mutant

Thu Jan 2 08:13:41 2014 UTC

John's Mutant contains a compiler for the mutant language. The compiler itself is written in Python 2.7.x. Currently it can generate nasm 2.X output, tested with 2.09.08. Mutant is a language similar in style to Pascal, but using slightly different keywords and grammar so students cannot just recycle something they find on the Internet.

The compiler has three targets: a FreeDOS .COM file, 32-bit Linux, and 64-bit Linux. The 64-bit Linux version violates the AMD64 ABI because it does not guarantee proper stack alignment yet. One unusual feature is that on Linux the generated code makes kernel system calls directly and does not use any libc.
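
As a rough illustration of the alignment requirement (this is not code from the compiler, and the helper names are made up for this sketch): the System V AMD64 ABI expects the stack pointer to be a multiple of 16 bytes at the point of a call. The arithmetic a code generator would need looks something like this:

```python
# Hypothetical helpers, not taken from the Mutant sources.
STACK_ALIGNMENT = 16  # bytes; the System V AMD64 ABI requirement at a call


def align_stack_down(rsp, alignment=STACK_ALIGNMENT):
    """Round rsp down to a multiple of alignment (which must be a power of two).

    This is the same arithmetic as the assembler idiom `and rsp, -16`.
    """
    return rsp & ~(alignment - 1)


def padding_needed(rsp, alignment=STACK_ALIGNMENT):
    """Number of bytes to subtract from rsp so that it becomes aligned."""
    return rsp - align_stack_down(rsp, alignment)


if __name__ == "__main__":
    rsp = 0x7FFFFFFFE008  # an example misaligned stack pointer
    print(hex(align_stack_down(rsp)), padding_needed(rsp))
```

The stack grows downward, so rounding down is the safe direction: it only ever reserves a few extra bytes.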

This project was suspended for a long time, but development is resuming. The first step was updating this web page and getting the source code into Mercurial version control. Future steps include:

  1. Change the strange keywords to normal English keywords.  (done)
  2. Fix the 64-bit ABI violations.  (done)
  3. Fix library issues.
  4. Add more test cases.  (done)
  5. Rewrite the symbol table.  (done)
    The original symbol table could not handle nested functions, and it did not deal properly with function parameters or constants.
  6. Test with current Python and nasm versions.  (done)
  7. Add a GNU gas assembler output option.

Visit the buraphakit SourceForge site to browse the source code or download a released version.

Adventures in DOS Debugging

I shouldn't admit this since it is rather embarrassing, but I'm hoping that if someone else reads this I can save them the amazing amount of pain I endured as I lost 6 days debugging the 16-bit DOS output. I was using a local buffer in the itoa library functions. Because DOS does not use protected mode, it cannot protect itself against wild writes to memory. Even in a modern protected-mode operating system you can still do wild writes to your own stack.

I was incorrectly using MOV instead of LEA and was using some random part of the DOS operating system area for the buffer. Since the buffer was small, sometimes the write occurred in parts of DOS that were not in use and the program would run. But you could change one line in the assembler somewhere and then the program would write to a slightly different spot of DOS that killed it. The symptom was that normally the program ran fine, but on exit the machine would hang. Not only that, if you ran the code in a debugger like ancient Turbo Debugger, ancient 16bit CodeView, ancient Watcom debugger, etc. it would exit normally. Until this year I never tried programming DOS at this level, and this was a good example of where a debugger just doesn't help if the programmer doesn't check each and every buffer address accessed in the program. I mean I just knew it wasn't me, so it had to be DOS, right? Uh, no, I should have thought harder and realized while DOS may not be perfect, it's known for reliably running .COM programs if and only if they are correctly coded.

So how did I fall into this trap? It was easy. If you have a simple variable in the data segment, you can load the data like this:

mov di,[varname]

That works fine. But when I needed to use a base pointer relative address, I tried this:

mov di,[bp-8]

It assembles! It runs! It hangs on exit! This was harder to see at first because I had defined macros for local variables like this:

%define varname  bp-8
...
mov di,[varname]
...
mov al,30h
...
mov strict byte [di],al

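This took the value of the base pointer, subtracted eight, and loaded whatever 16-bit word happened to be stored at that address into di. The later write through di then went wherever that leftover word pointed. What I wanted in di was the address of that location (bp-8 itself), not the contents stored there. I needed to do this: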

%define varname  bp-8
...
lea di,[varname]
...
mov al,30h
...
mov strict byte [di],al
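
To make the difference concrete, here is a toy Python model of a single 64 KB segment. It is only a sketch: the function names, the value of bp, and the leftover bytes in the buffer are all invented for illustration, and it ignores segment registers entirely.

```python
# Toy model of one 64 KB real-mode segment. Everything here is illustrative.
SEGMENT_SIZE = 0x10000


def read_word(mem, addr):
    """Read a little-endian 16-bit word, wrapping at the segment boundary."""
    lo = mem[addr & 0xFFFF]
    hi = mem[(addr + 1) & 0xFFFF]
    return lo | (hi << 8)


def mov_di(mem, bp, disp):
    """Model of `mov di,[bp+disp]`: di gets the word STORED at bp+disp."""
    return read_word(mem, (bp + disp) & 0xFFFF)


def lea_di(bp, disp):
    """Model of `lea di,[bp+disp]`: di gets the ADDRESS bp+disp itself."""
    return (bp + disp) & 0xFFFF


def store_byte(mem, di, al):
    """Model of `mov byte [di],al`."""
    mem[di & 0xFFFF] = al & 0xFF


mem = bytearray(SEGMENT_SIZE)
bp = 0xFFF0                 # assumed frame pointer near the top of the segment
mem[bp - 8] = 0x34          # pretend the uninitialized buffer holds
mem[bp - 7] = 0x12          # the leftover word 1234h

wrong_di = mov_di(mem, bp, -8)   # 1234h: a wild pointer built from leftovers
right_di = lea_di(bp, -8)        # FFE8h: the buffer's real address

if __name__ == "__main__":
    print(hex(wrong_di), hex(right_di))
```

With mov, the byte 30h is written wherever the leftover word points, and that target changes whenever the stack contents change. With lea, it always lands in the buffer.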

What hurts most of all is I had it correct in the 64-bit and 32-bit versions, but had it wrong in 16-bit. I kept thinking it was a DOS problem and never even compared the assembler output of the three versions. I tried debugx, insight, and many other lesser-known debuggers but that didn't help. I decided the DOSBox I was using might be bad and found some real floppy disks and booted up FreeDOS. Then I decided maybe FreeDOS and DOSBox both had a bug in some undocumented area of DOS, so I found a copy of DR-DOS 7. Still no joy. I then scrounged around and found a genuine MS-DOS 6.22 boot floppy, and when the code caused that to hang it began to dawn on me that maybe the program was at fault. As part of getting the debuggers to run, of course, I had to learn all kinds of stuff about CWSDPMI and other things. On the plus side, I do have a pretty comprehensive set of DOS debuggers for next time. But you can rest assured that next time the first thing I will do is look for places I should have used lea.

Moral: "Blame it on DOS" is a cop-out.

The programmer, not DOS, is responsible for ensuring the program does not commit wild writes. I thought that since all four segment registers were set to the same value and addresses are limited to 16 bits, the only wild writes would be into my own program. Actually, bp-relative addresses can escape, and it is also possible to overwrite parts of the PSP (the first 256 bytes of the program's segment, just below the point where the COM file is loaded). And of course you can wipe out your own stack too. Throw all that 'the O/S will protect itself' thinking away when you program for DOS. Even if you had protected mode, the stack would have to stay writable even if the PSP and the O/S were not.
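
One way to picture what a stray offset can hit inside a .COM program's segment is a small classifier like the one below. This is a hedged sketch, not part of the compiler: the function name, the image size, and the stack pointer value are assumptions chosen for the example. The only layout facts it relies on are that the PSP occupies offsets 0000h to 00FFh and that the .COM image is loaded at offset 0100h.

```python
# Illustrative only: classify an offset within a .COM program's 64 KB segment.
PSP_SIZE = 0x100          # the PSP occupies offsets 0000h-00FFh
COM_LOAD_OFFSET = 0x100   # the .COM image is loaded right after the PSP


def classify_offset(offset, image_size, sp):
    """Say which region of the segment a 16-bit offset falls into.

    image_size is the size of the .COM file in bytes and sp is the current
    stack pointer; both are supplied by the caller for this sketch.
    """
    offset &= 0xFFFF
    if offset < PSP_SIZE:
        return "PSP"
    if offset < COM_LOAD_OFFSET + image_size:
        return "program image"
    if offset >= sp:
        return "live stack"
    return "free space"


if __name__ == "__main__":
    # Assumed values: a 1 KB program with the stack pointer at FFE0h.
    for off in (0x0080, 0x0100, 0x0500, 0xFFE8):
        print(hex(off), classify_offset(off, 0x400, 0xFFE0))
```

Only a write into "free space" is harmless. A wild pointer that lands in any of the other three regions can corrupt the program, its stack, or the data DOS needs at exit.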