I decided to run a little experiment to see if I could assemble, link and run a 16-bit executable on Windows 10. This post documents my experience.
Why a 16-bit executable? Well, I wanted the smallest possible “Hello World” executable for use when introducing students to IDA Pro.
Here is the assembly code listing. It is MASM-specific, so you will have to modify the code if you want to build it with the NASM, FASM or GAS toolchains.
TITLE HelloWorld1 (HW1)

.MODEL small
.STACK 100h

.DATA
MESSAGE DB "Hello world! 16-bit DOS assembly here!", 0Dh, 0Ah, "$"

.CODE
main PROC
    mov ax, @DATA
    mov ds, ax
    lea dx, MESSAGE
    mov ah, 9
    int 21h
    mov ax, 4C00h
    int 21h
main ENDP
END main
Here I use MASM 6.14 to assemble and link the above assembly code:
The output executable is named hw1.exe.
Next I attempted to use MASM 7.10 to assemble and link the same assembly code:
As you can see, MASM 7.10 returns two errors: “undefined DGROUP” and “leading underscore required for start address”. There are a couple of ways of overcoming these errors; I chose the easiest one, which is assembling with the /omf switch.
Early versions of MASM generated OMF-format object modules, while PE/COFF is generated by default by MASM 7.1 and later. According to Wikipedia:
The Relocatable Object Module Format (OMF) is an object file format used primarily for software intended to run on Intel 80x86 microprocessors. It was originally developed by Intel under the name Object Module Format, and is perhaps best known to DOS users as an .OBJ file.
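One quick way to tell which of the two formats an assembler emitted is to look at the first bytes of the object file. The following is a minimal Python sketch, not something used in the experiment itself. The function name guess_obj_format is hypothetical. It assumes that an OMF module opens with a THEADR (80h) or LHEADR (82h) record, and that a COFF object opens with a 16-bit little-endian machine field (014Ch for x86, 8664h for x64). Other machine values exist and are simply reported as unknown here.

```python
import struct

# Record types that may open an OMF object module.
OMF_HEADER_RECORDS = {0x80: "THEADR", 0x82: "LHEADR"}

# A few COFF machine values (16-bit little-endian field at offset 0).
COFF_MACHINES = {0x014C: "x86", 0x8664: "x64"}


def guess_obj_format(data: bytes) -> str:
    """Guess whether the leading bytes of an .obj file look like OMF or COFF."""
    if len(data) < 2:
        return "unknown"
    if data[0] in OMF_HEADER_RECORDS:
        return "OMF"
    (machine,) = struct.unpack_from("<H", data, 0)
    if machine in COFF_MACHINES:
        return "COFF"
    return "unknown"
```

To try it on a real file, read the object file in binary mode and pass the bytes in, for example guess_obj_format(open("hw1.obj", "rb").read()), where hw1.obj is an assumed name for the intermediate object file.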
Here is what was displayed when I attempted to execute hw1.exe from the command prompt in Windows 10:
Note the GUI modal pop-up message box; the command prompt window is nowadays actually a full GUI application.
Similar situation here, but a non-modal pop-up message was displayed:
The workaround was to run hw1.exe in DOSBox:
Putting on my reverser hat, my first question was whether the SysInternals strings utility would handle 16-bit executables. Well, you can see the results for yourself below.
The MZ characters are the “magic string” that denotes an MS-DOS executable. These two characters are the initials of Mark Zbikowski, who designed the original MS-DOS executable file format.
The next question was whether IDA Pro would handle my 16-bit executable. It turns out that it could.
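The MZ signature is followed by a small header of 16-bit little-endian fields that tells DOS how to load the program. As a rough illustration, here is a Python sketch that checks the signature and pulls out a few of those fields. The function name parse_mz_header and the dictionary keys are hypothetical. The field order follows the standard DOS header layout (e_magic, e_cblp, e_cp, e_crlc, e_cparhdr, e_minalloc, e_maxalloc, e_ss, e_sp, e_csum, e_ip, e_cs, e_lfarlc, e_ovno).

```python
import struct


def parse_mz_header(data: bytes) -> dict:
    """Parse a few fields from the start of an MS-DOS (MZ) executable."""
    if len(data) < 0x1C or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The first 14 little-endian 16-bit words of the DOS header.
    (_magic, last_page_bytes, pages, relocs, header_paras,
     _min_alloc, _max_alloc, ss, sp, _checksum, ip, cs,
     _reloc_offset, _overlay) = struct.unpack_from("<14H", data, 0)
    # The file is counted in 512-byte pages; a non-zero e_cblp means
    # the last page is only partially used.
    if last_page_bytes:
        image_bytes = (pages - 1) * 512 + last_page_bytes
    else:
        image_bytes = pages * 512
    return {
        "entry_point": (cs, ip),           # initial CS:IP, relative to the load segment
        "stack": (ss, sp),                 # initial SS:SP, SS relative to the load segment
        "relocations": relocs,             # number of relocation table entries
        "header_bytes": header_paras * 16, # header size is stored in 16-byte paragraphs
        "image_bytes": image_bytes,        # file size according to the header
    }
```

Calling parse_mz_header(open("hw1.exe", "rb").read()) on the executable built earlier should report the same entry point that a disassembler shows, once the load segment is added to the CS value.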
Here is the full disassembly listing produced by IDA Pro 6.9:
seg000:0000 ;
seg000:0000 ; +-------------------------------------------------------------------------+
seg000:0000 ; | This file has been generated by The Interactive Disassembler (IDA) |
seg000:0000 ; | Copyright (c) 2015 Hex-Rays,|
seg000:0000 ; | License info: XXXXXXXXXXXXXXX |
seg000:0000 ; | Finnbarr Murphy, personal license |
seg000:0000 ; +-------------------------------------------------------------------------+
seg000:0000 ;
seg000:0000 ; Input SHA256 : B6CA752F014E2B0BAF0D8A82D4D163CAD73BD9C183E764CA0010B4CA999F7164
seg000:0000 ; Input MD5 : C76EEFAE7E1FD7EC03690EFFA9CFAE50
seg000:0000 ; Input CRC32 : DCC0D305
seg000:0000
seg000:0000 ; File Name : C:\Users\fpm\Desktop\hw1.exe
seg000:0000 ; Format : MS-DOS executable (EXE)
seg000:0000 ; Base Address: 1000h Range: 10000h-10140h Loaded length: 3Bh
seg000:0000 ; Entry Point : 1000:0
seg000:0000
seg000:0000                 .686p
seg000:0000                 .mmx
seg000:0000                 .model large
seg000:0000
seg000:0000 ; ===========================================================================
seg000:0000
seg000:0000 ; Segment type: Pure code
seg000:0000 seg000          segment byte public 'CODE' use16
seg000:0000                 assume cs:seg000
seg000:0000                 assume es:nothing, ss:seg002, ds:nothing, fs:nothing, gs:nothing
seg000:0000
seg000:0000 ; =============== S U B R O U T I N E =======================================
seg000:0000
seg000:0000 ; Attributes: noreturn
seg000:0000
seg000:0000                 public start
seg000:0000 start           proc near
seg000:0000                 mov ax, seg dseg
seg000:0003                 mov ds, ax
seg000:0005                 assume ds:dseg
seg000:0005                 lea dx, unk_10012
seg000:0009                 mov ah, 9
seg000:000B                 int 21h ; DOS - PRINT STRING
seg000:000B                         ; DS:DX -> string terminated by "$"
seg000:000D                 mov ax, 4C00h
seg000:0010                 int 21h ; DOS - 2+ - QUIT WITH EXIT CODE (EXIT)
seg000:0010 start           endp    ; AL = exit code
seg000:0010
seg000:0010 seg000          ends
seg000:0010
dseg:0002 ; ===========================================================================
dseg:0002
dseg:0002 ; Segment type: Pure data
dseg:0002 dseg            segment para public 'DATA' use16
dseg:0002                 assume cs:dseg
dseg:0002                 ;org 2
dseg:0002 unk_10012       db 48h ; H ; DATA XREF: start+5o
dseg:0003                 db 65h ; e
dseg:0004                 db 6Ch ; l
dseg:0005                 db 6Ch ; l
dseg:0006                 db 6Fh ; o
dseg:0007                 db 20h
dseg:0008                 db 77h ; w
dseg:0009                 db 6Fh ; o
dseg:000A                 db 72h ; r
dseg:000B                 db 6Ch ; l
dseg:000C                 db 64h ; d
dseg:000D                 db 21h ; !
dseg:000E                 db 20h
dseg:000F                 db 31h ; 1
dseg:0010                 db 36h ; 6
dseg:0011                 db 2Dh ; -
dseg:0012                 db 62h ; b
dseg:0013                 db 69h ; i
dseg:0014                 db 74h ; t
dseg:0015                 db 20h
dseg:0016                 db 44h ; D
dseg:0017                 db 4Fh ; O
dseg:0018                 db 53h ; S
dseg:0019                 db 20h
dseg:001A                 db 61h ; a
dseg:001B                 db 73h ; s
dseg:001C                 db 73h ; s
dseg:001D                 db 65h ; e
dseg:001E                 db 6Dh ; m
dseg:001F                 db 62h ; b
dseg:0020                 db 6Ch ; l
dseg:0021                 db 79h ; y
dseg:0022                 db 20h
dseg:0023                 db 68h ; h
dseg:0024                 db 65h ; e
dseg:0025                 db 72h ; r
dseg:0026                 db 65h ; e
dseg:0027                 db 21h ; !
dseg:0028                 db 0Dh
dseg:0029                 db 0Ah
dseg:002A                 db 24h ; $
dseg:002B                 db ? ;
dseg:002C                 db ? ;
dseg:002D                 db ? ;
dseg:002E                 db ? ;
dseg:002F                 db ? ;
dseg:002F dseg            ends
dseg:002F
seg002:0000 ; ===========================================================================
seg002:0000
seg002:0000 ; Segment type: Uninitialized
seg002:0000 seg002          segment byte stack 'STACK' use16
seg002:0000                 assume cs:seg002
seg002:0000                 assume es:nothing, ss:nothing, ds:dseg, fs:nothing, gs:nothing
seg002:0000                 db 100h dup(?)
seg002:0000 seg002          ends
seg002:0000
seg002:0000
seg002:0000                 end start
The only thing that IDA Pro got wrong was the memory model, large instead of small. A trivial detail.
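IDA Pro left the message at unk_10012 as individual db bytes rather than collapsing it into a string. As a small sanity check, the following Python sketch reads those bytes back into text, stopping at the "$" that DOS function 09h treats as the terminator. The byte values are copied from the dseg listing above (dseg:0002 through dseg:002A). The name dos_string is hypothetical.

```python
# Bytes copied from the dseg listing (dseg:0002 through dseg:002A).
DSEG_BYTES = bytes([
    0x48, 0x65, 0x6C, 0x6C, 0x6F, 0x20, 0x77, 0x6F, 0x72, 0x6C, 0x64, 0x21,
    0x20, 0x31, 0x36, 0x2D, 0x62, 0x69, 0x74, 0x20, 0x44, 0x4F, 0x53, 0x20,
    0x61, 0x73, 0x73, 0x65, 0x6D, 0x62, 0x6C, 0x79, 0x20, 0x68, 0x65, 0x72,
    0x65, 0x21, 0x0D, 0x0A, 0x24,
])


def dos_string(data: bytes, offset: int = 0) -> str:
    """Return the text that INT 21h / AH=09h would print starting at offset."""
    end = data.index(b"$", offset)  # the "$" terminator itself is not printed
    return data[offset:end].decode("ascii")


print(repr(dos_string(DSEG_BYTES)))
# prints 'Hello world! 16-bit DOS assembly here!\r\n'
```

The recovered text matches the MESSAGE definition in the source, including the trailing carriage return and line feed.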
By the way, this is the smallest executable I have ever reversed using IDA Pro. It is also the smallest disassembly listing that I have seen IDA Pro produce.
Enjoy!