16 Jan

16 bits COM Oddity

I can’t even pinpoint what a 16 bits COM Oddity really means, but I think the idea is in there, somehow. Previously, I explained how to code a simple “hello, world” program using the DEBUG tool that shipped with DOS. Revisiting this obsolete knowledge was unexpectedly fun. We’ll retrieve the hexadecimal version of “hello, world” (well, “hello, world!!”) from that post:

EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64
21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21

That’s all we need for our “hello, world!!” binary. 32 bytes exactly. We could create that file byte by byte, but that’d be excessive, I think. Let’s use the echo command instead. This is the full command I entered in my Windows 10 cmd.exe prompt:

echo|set /p="Ù‼♪◙hello, world!!♪◙$┤○║☻☺═!1└═!">hello.com

After that you’ll get a 16-bit COM, hello.com, that will display the “hello, world!!” message. Funny 🙂

What are those weird characters?

First, a little explanation. We want our hello.com file to be, byte after byte, an exact representation of the hexadecimal sequence presented above. We’ll use cmd.exe commands to dump characters into the file and, if we choose our characters carefully so that they match the target hexadecimal values, we’ll end up with the exact representation we’re looking for. For instance, the first two-byte block, EB 13, is the “jmp 115” instruction: a short jump that skips 13h (19) bytes of string data and lands on offset 0115h, where the code begins. Then comes the newline (0D 0A), and so on. If we convert our hexadecimal to decimal, we get:

235  19  13  10 104 101 108 108 111  44  32 119 111 114 108 100
 33  33  13  10  36 180   9 186   2   1 205  33 180   0 205  33
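The conversion is easy to double-check with a few lines of Python. This is only a sketch for verifying the table above; the names HEX_DUMP and program are mine, not something from the original post.

```python
# The 32-byte listing from the post, copied verbatim.
HEX_DUMP = """
EB 13 0D 0A 68 65 6C 6C 6F 2C 20 77 6F 72 6C 64
21 21 0D 0A 24 B4 09 BA 02 01 CD 21 B4 00 CD 21
"""

# Strip all whitespace first, then parse the hex digit pairs into bytes.
program = bytes.fromhex("".join(HEX_DUMP.split()))

# Iterating over a bytes object yields each byte as a decimal integer.
decimal_values = list(program)

print(len(program))          # size of the binary in bytes
print(decimal_values[:16])   # first row of the decimal table
print(decimal_values[16:])   # second row of the decimal table
```

Writing program to a file opened in binary mode would be another way to obtain hello.com, without depending on the console code page.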

The first byte in hello.com must be EB, or 235 in decimal. In order to dump our characters from the command line, we’ll convert that decimal value to a character. I’m trying this on a Windows 10 (64-bit) machine, with cmd.exe using code page 850 (Multilingual Latin 1). In that code page, character 235 is Ù. And 19 is ‼. And, luckily, 13 is ♪ and 10 is ◙. Those two characters are especially important because they represent the carriage return and the line feed, respectively, and some shells won’t convert them to characters. Happily, cmd.exe with my default code page handles them as we need. To input those characters you can type the usual ALT + decimal value.
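To see which bytes the characters in the echo command stand for, here is a rough Python sketch. It assumes code page 850 for the ordinary characters (Python’s cp850 codec) and uses a hand-written table for the pictographs that the OEM code pages draw for bytes 1 to 31, because the codec maps those bytes to control characters instead. The names CONTROL_GLYPHS and glyph_to_byte are made up for this illustration.

```python
# Pictographs drawn for control bytes 0x01..0x1F, listed in byte order.
CONTROL_GLYPHS = "☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"


def glyph_to_byte(ch: str) -> int:
    """Return the byte value assumed for one character under code page 850."""
    index = CONTROL_GLYPHS.find(ch)
    if index >= 0:
        return index + 1           # pictograph from the control range
    return ch.encode("cp850")[0]   # everything else goes through the codec


# The quoted string from the echo command, copied from the post.
COMMAND_TEXT = "Ù‼♪◙hello, world!!♪◙$┤○║☻☺═!1└═!"

encoded = bytes(glyph_to_byte(ch) for ch in COMMAND_TEXT)

print(len(encoded))
print(" ".join(f"{b:02X}" for b in encoded))
```

Under these assumptions the string encodes 32 bytes, and the first 28 match the listing. The last four characters, 1└═!, come out as 31 C0 CD 21 rather than B4 00 CD 21. The bytes 31 C0 are “xor ax,ax”, which also leaves AH at zero before the final INT 21h.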

There are a few important things to notice:

15 Jan

“Hello world” with DEBUG

Coding “Hello world” with DEBUG will be a blunt exercise in programming futility. Or an exercise in retro, old-school coding. More than two decades ago I used to code in x86 (Intel) assembly, almost daily. I remember the masochistic approach to learning the opcodes and the hardware architecture. The famous RBIL (Ralf Brown’s Interrupt List) was, back then, my favorite “reference”. First painful steps were taken and first crashes happily followed. I remember trying to code, as expected, the traditional “hello, world!”, using a strange tool included in DOS, DEBUG.COM. I wrote a post about this “hello, world” with DEBUG.COM elsewhere, and yesterday I found the time to reread it: I verified, first with awe, then with horror, and finally with relief, that I had almost completely forgotten how to code in assembly. So I’ll revisit this here, mostly as a self-imposed disciplinary measure, an exercise in programming, specifically, an exercise in programming futility. Heck, DEBUG isn’t even available on the Windows 10 machine I’m typing this on. However, DEBUG looked pretty cool back then: it could assemble, disassemble and dump hexadecimal output. You could create little programs, or inspect programs and peek at memory areas.

Specifically what I want is to build a minimal “hello, world!” program using DEBUG.COM. I don’t have any use for this, but it comes as a “relaxing” post after several weeks focused on the release of “DragonScales 3: Eternal Prophecy of Darkness” on Steam and the localization of “DragonScales 5: The Frozen Tomb”. After we execute DEBUG.COM we’ll meet a prompt with a “-” symbol. Now we can input our commands. I want to assemble, i.e., I want to type assembly language instructions. The command for that is “a”, which might be optionally followed by a memory address. By default, instructions will be placed starting from CS:0100, so I’ll use that address. Equivalently, I could type “a 0100” or “a 100” to achieve the same result.

-a

Now we have to place the data in memory. For this little program I only need the characters for “hello, world!!”. Notice I want two “!!” at the end. That’s because I want the final program to occupy exactly 32 bytes; we’ll see the reason for this later on. I’ll use the pseudo-instruction “DB” to define our string. With DB I can neatly provide the string using ASCII values, like this:

 db "hello, world!!"

Those are 14 bytes. However, I want a prettier output, with a newline character before and after our string. A newline is in fact two characters: a carriage return (CR is ASCII 13) and a line feed (LF is ASCII 10). In hexadecimal, CR is 0Dh, and LF is 0Ah. OK. Now our DB would be modified to look like this:

 db 0d,0a,"hello, world!!",0d,0a

Those are 18 bytes. We are not done yet with our data. In order to actually print the message to the standard output I’ll resort to function 09h of INT 21h. Check RBIL D-2109. In short, I have to place the value 09h in register AH, and DS:DX should point to the beginning of our string. The function will print every character until it finds a “$” character (i.e., “$” acts like the zero in null-terminated C strings). The ASCII value of “$” is 36, or 24 in hexadecimal. Therefore, we modify our DB instruction again:

 db 0d,0a,"hello, world!!",0d,0a,"$"
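The byte counts can be checked with a small Python sketch. It assumes the data is preceded by a two-byte short jump at CS:0100, as in the finished 32-byte listing (EB 13 ...); the names LOAD_ADDRESS, JMP_SHORT_SIZE, data_start and code_start are mine, chosen only for illustration.

```python
CR, LF = 0x0D, 0x0A

# The same bytes the final DB line defines: CR LF, the text, CR LF, "$".
data = bytes([CR, LF]) + b"hello, world!!" + bytes([CR, LF]) + b"$"

LOAD_ADDRESS = 0x0100   # default assembly address in DEBUG (CS:0100)
JMP_SHORT_SIZE = 2      # assumption: a short jump (EB xx) sits before the data

data_start = LOAD_ADDRESS + JMP_SHORT_SIZE   # the address DX has to hold
code_start = data_start + len(data)          # first instruction after the data

# Function 09h prints up to, but not including, the "$" terminator.
printed = data[:data.index(b"$")]

print(len(data))                 # bytes defined by the DB line
print(f"{data_start:04X}")       # where the string begins
print(f"{code_start:04X}")       # where the code begins
print(printed)
```

The numbers agree with the listing: 19 (13h) bytes of data explain the 13 in EB 13, the string begins at 0102 (the operand in BA 02 01, stored low byte first), and the code begins at 0115.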