Re-implementing an old DOS game in C++ 17

Back in 2016, I started a side project to reverse engineer the game Duke Nukem II and build an open source reimplementation of its engine from scratch – called Rigel Engine (Check it out on GitHub). Now, more than 2 1/2 years later, my version is complete enough to allow playing the entire shareware episode of the original game, offering a practically identical experience to running the original. Here’s a video showing off the first level:

So what can it do? Rigel Engine works as a drop-in replacement for the original DOS binary (NUKEM2.EXE). You can place it into the game directory and it will read all the data from there, or you can specify the path to the game data as a command-line argument. It builds and runs on Windows, Mac OS X, and Linux. It is based on SDL and OpenGL 3/OpenGL ES 2, and written in C++ 17.

It implements the game logic for all the enemies and game mechanics found in the Shareware episode, plus most of the menu system. Saved games and high scores from the original game can also be imported.

On top of that, it already offers some enhancements compared to running the original game:

  • No emulator or vintage hardware required, no need to tweak settings
  • No loading screens – hit enter in the “new game” menu and you’re immediately in the action
  • Multiple sound effects can play at the same time, which is not possible in the original
  • No limitations on the number of simultaneous particle effects, explosions etc. going on
  • Per-user save files and high score lists
  • Much more responsive menus

Now, I don’t consider Rigel Engine fully “done” yet. But this is a nice milestone for sure, and a good opportunity to write about the project again (find some older posts here and here). Let’s start by taking stock of what’s in the code right now, and how I got there.

How much code is it?

At the time of this writing, RigelEngine consists of 270 source files containing over 25k lines of code (without comments/blank lines). Of those, 10 files and 2.5k lines are unit tests. A breakdown including blank lines and comments can be found here.

What’s in all that code? There’s a bit of general infrastructure and utilities, we have fundamentals like rendering, and a lot of smaller pieces of logic here and there. On top of that, some of the bigger chunks are:

  • parsers/loaders for 14 different file formats used by the original game – 2k LOC
  • behavior/game logic for 24 enemies/hostile objects – 3.8k LOC
  • game logic for 14 interactive elements and game mechanics – 2k LOC
  • the player control logic – 1.2k LOC
  • 154 configuration entries (how much health does this enemy have, how many points does this collectable give etc.) – 1k LOC
  • 31 destruction effect specifications (effects triggered when an enemy or other destructible object is destroyed) – 254 LOC
  • the camera control code – 159 LOC
  • interpreter for the game’s menu/cut scene description language – 643 LOC
  • HUD and other UI code – 818 LOC
  • 5 non-menu screens/modes, e.g. the intro movie, bonus screen etc. – 789 LOC

Of course, all of this code had to be written, which brings us to the next part.

How much work was it?

Although it’s been two and a half years since I started the project, I didn’t work on it continuously during this time. There were a couple of months where I didn’t spend any time on the project, and others where I only put a few hours into it. Then, there have also been times where I worked quite extensively on Rigel Engine. Looking at the commit chart from GitHub gives us a rough idea of how my efforts were distributed over time:

[Screenshot: GitHub commit activity chart]

What we see in that chart are 1081 commits to the master branch. Before creating the public repository, though, I was working in a private one which features another 247 commits, giving us 1328 commits in total. In addition to that, there were various prototype branches I used to explore and experiment but never merged, and I sometimes squashed larger commit histories down to a more condensed form before merging.

Now, writing code was only one part of the project – reverse-engineering being the other major one. I’ve spent quite a few hours looking at the original executable’s disassembly in IDA Pro (the free version), taking notes, writing down pseudocode, and planning out how to implement things in my version. I also did a lot of testing with the original game, running it in DOSBox and on original hardware (various 386 and 486 machines which I got from eBay). I built test levels for focused observation of specific enemies and game mechanics, recorded video captures of these using DOSBox, and stepped through the recordings frame by frame to verify my findings from reading assembly code. Once I had implemented an enemy or game mechanic, I would also typically record footage from my version, and compare it to the original frame by frame to verify the accuracy of my implementation.

A few pictures from some of my notes:

Reverse-engineering the camera control code. The large rectangle represents the screen. The dotted lines indicate zones in which the player can move without the camera following. If you’re curious, the actual camera control code can be found here.
General notes to help with understanding assembly code. Left side is the original game’s update order on a high level. Right side is notes about a bit-field representing some game object state.
Transcription of assembly into pseudo code. I usually did this in a fairly mechanical way, transcribing assembly without thinking too much about what the code is doing, and then using the pseudo code version to get an understanding of the underlying logic. Based on that, I’d then derive my implementation. See the final code here.

 

Pseudo-code for the cleaned up version of an enemy’s logic. The captions represent states in the state machine, the code below indicates what should happen in the respective states. This was derived from the raw pseudo-code resulting from transcribing assembly. You can find the final code here.
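
To make the state machine idea from the last note more concrete, here is a minimal C++ 17 sketch of how such a cleaned-up enemy could be written down. Everything in it is an assumption made for illustration: the states (`Idle`, `Walking`, `Shooting`), the frame counts, and the `update` function are hypothetical and not taken from the original game or from Rigel Engine's code.

```cpp
#include <cassert>
#include <variant>

// Hypothetical states for an illustrative enemy. Each state carries only the
// data it needs, similar to the captions in the handwritten notes.
struct Idle {};
struct Walking { int framesLeft; };
struct Shooting { int framesLeft; };

using EnemyState = std::variant<Idle, Walking, Shooting>;

struct Enemy {
  int x = 0;
  int shotsFired = 0;
  EnemyState state = Idle{};
};

// Common C++ 17 helper for building a visitor out of several lambdas.
template <typename... Ts>
struct Overloaded : Ts... { using Ts::operator()...; };
template <typename... Ts>
Overloaded(Ts...) -> Overloaded<Ts...>;

// Advances the enemy by one logic update. Each lambda describes what happens
// in one state and returns the state to use for the next update.
void update(Enemy& enemy, bool playerVisible) {
  enemy.state = std::visit(
    Overloaded{
      [&](Idle) -> EnemyState {
        if (playerVisible) return Walking{3};
        return Idle{};
      },
      [&](Walking s) -> EnemyState {
        assert(s.framesLeft > 0);
        ++enemy.x;  // walk one step per update
        if (s.framesLeft <= 1) return Shooting{2};
        return Walking{s.framesLeft - 1};
      },
      [&](Shooting s) -> EnemyState {
        if (s.framesLeft == 2) ++enemy.shotsFired;  // fire on the first frame
        if (s.framesLeft <= 1) return Idle{};
        return Shooting{s.framesLeft - 1};
      }},
    enemy.state);
}
```

Calling `update(enemy, playerVisible)` once per logic update is all that is needed to drive it. Using `std::variant` here is one possible design choice: it keeps the per-state data next to the state itself, and a forgotten state becomes a compile error instead of a silent fall-through.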

Ultimately, doing this was quite a lot of fun, and I learned a lot: about reverse engineering, 16-bit x86 assembly, low-level VGA programming, and the strict limitations PC game developers in the early 90s had to face. I also gained a lot of insight into how the original game works, and how quirky and odd some of its implementation is – that’s actually worth a series of dedicated blog posts at some point.

What’s next?

Besides adding the last missing features and finishing support for the registered version, I have quite a few ideas for enhancing and improving Rigel Engine, not to mention a whole lot of cleanups and refactoring that could be done – as usual, the best way to architect a piece of software becomes most evident once you’re done writing said piece of software 🙂

In terms of future enhancements, here are some of the things I’ve thought about doing:

  • Smooth motion with interpolation. The game only updates its logic roughly 15 times a second, and in the original version, that’s also the frame rate for rendering. Rigel Engine, on the other hand, can easily run at 60 FPS or higher. At the moment, these additional frames don’t really offer any benefits, but I can imagine using them for in-between frames to make for smoother scrolling and object motion. The game logic would still run at its original speed, but objects would move smoothly instead of “jumping” by 8-pixel increments as they do now. I’ve prototyped this in the past and it looks great, but needs a bit more work.
  • Gamepad support. The original game features joystick support, and DOSBox can emulate this for a modern gamepad, but it can be a bit cumbersome to set up, requiring configuration tweaks and in-game calibration. Not to mention that not all controller buttons are supported, and menus often still require using the keyboard. So I believe that native controller support would make for a much better experience.
  • Enhanced audio. At the moment, sound effects always have the same volume. Objects that produce sound, e.g. force fields, will abruptly become audible once they appear on screen, and go silent just as abruptly. I was wondering how it would sound if these effects weakened with distance instead. So you would faintly hear the force field even when it’s not on screen, and it would get louder as you get closer.
  • Zoomed out view/showing more of the level at once. The game wasn’t made for this, so it might hurt the experience somewhat as you’d start to see enemies being inactive while off screen etc. But I’m still interested to see how it would look and feel. Not being able to see enough of the surroundings is a common complaint about the game, after all. Having an option to disable the HUD or replace it with a more minimal one making use of transparency could also be interesting.
  • Replacing graphics with higher-resolution versions. This is a common feature in many source ports/recreations of games, and would be nice to see here as well. It’s already possible to override sprite graphics with custom images, but they can’t be higher resolution at the moment, since everything is rendered to a small internal buffer and then upscaled. This first needs to change so that upscaling happens per object.
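
The smooth-motion idea from the first item in this list can be sketched roughly as follows. This is not the prototype mentioned above, just a minimal illustration of the general technique (a fixed logic rate with interpolated rendering). The names `Simulation`, `updateLogic`, and `advanceAndGetRenderX`, as well as the exact logic rate of 15 updates per second, are assumptions made for the example.

```cpp
#include <cassert>

// Assumed logic rate; the text above only says "roughly 15 times a second".
constexpr double LOGIC_STEP_SECONDS = 1.0 / 15.0;

struct Object {
  double previousX = 0.0;  // position after the previous logic update
  double currentX = 0.0;   // position after the latest logic update
};

struct Simulation {
  Object object;
  double accumulator = 0.0;  // elapsed time not yet consumed by logic updates

  // Hypothetical game logic: the object moves by 8 pixels per update.
  void updateLogic() {
    object.previousX = object.currentX;
    object.currentX += 8.0;
  }

  // Called once per rendered frame with the time since the last frame.
  // Runs as many fixed-rate logic updates as fit into the elapsed time, then
  // returns the position to draw, blended between the last two logic states.
  double advanceAndGetRenderX(double frameSeconds) {
    assert(frameSeconds >= 0.0);
    accumulator += frameSeconds;
    while (accumulator >= LOGIC_STEP_SECONDS) {
      updateLogic();
      accumulator -= LOGIC_STEP_SECONDS;
    }

    // Fraction of a logic step that has passed since the latest update.
    const double alpha = accumulator / LOGIC_STEP_SECONDS;
    return object.previousX + (object.currentX - object.previousX) * alpha;
  }
};
```

With this approach the game logic still advances in 8-pixel steps at its original speed, while a renderer running at 60 FPS draws the in-between positions. The trade-off is that what's shown on screen lags behind the logic by up to one logic update.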

I don’t really have a set roadmap for the future, so I might do any of these things whenever I feel like it. Before all that though, the next step is integrating Dear ImGui in order to then build an options menu – which is still a missing feature in the game in general, but also necessary so that it’s possible to enable and disable the enhancements listed above. Finally, of course, contributions on GitHub are always welcome!


10 thoughts on “Re-implementing an old DOS game in C++ 17”

    1. Unfortunately not, the original source has been lost according to an interview with the developers. But if you look at other games from the same era, Cosmo’s Cosmic Adventure has around 25k LOC and Commander Keen has around 35k IIRC, so I’d assume the original Duke code is in a similar ballpark.

      1. It seems a bit disappointing if the C++ 17 implementation would not be sufficiently less source code. If it’s similar, what’s the reason for that given that C++ 17 is significantly more expressive?

      2. It’s a fair assumption that a modern implementation should be much less code, but there are a few reasons why that’s not the case:

        * The file formats for graphics and sound are optimized to be more or less directly copied from disk into hardware-mapped memory regions, so there is practically no conversion or parsing going on in the original code. My version, on the other hand, has to convert the data into formats usable by modern video and audio APIs, i.e. 32-bit RGBA color images and 16-bit PCM audio buffers. My version also does this in an endian-independent way, and adds quite a few correctness checks on top to reject corrupt data in an orderly fashion instead of triggering undefined behavior, none of which happens in the original code.
        * Similarly, since the original is a DOS application, it needs very little code to set up a fullscreen video mode. The higher level of abstraction provided by a modern OS means it’s a bit more work to set things up, and interacting with the GPU is also more code. Whereas the original can basically do a memcpy to show something on screen, I need 1000 lines of OpenGL code… Abstraction is great, but it comes at a cost, too.
        * We don’t know what the original code looks like, but given other typical code from this era, it seems safe to assume that it involves a lot of pointer arithmetic and casting, and isn’t very type-safe. In RigelEngine, on the other hand, I try to be fairly type-safe, which sometimes means needing more code, additional data transformations etc.
        * The original uses a lot of global variables, tends to reuse the same variables for different purposes, and often has unrelated behavior hacked into existing functions. My aim was to be much cleaner, so I don’t have global variables, but this in turn means more boilerplate to inject dependencies into objects on construction. Better separation of concerns also sometimes means a bit more code.
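
To illustrate the first point in this list, here is a small C++ 17 sketch of what endian-independent reading, conversion to 32-bit RGBA, and orderly rejection of corrupt data can look like. It is not RigelEngine's loader code. The function names are made up for the example, and the palette layout (3 bytes per entry, with VGA-style 6-bit channels in the range 0 to 63) is an assumption rather than a description of the game's actual file formats.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Reads a little-endian 16-bit value one byte at a time, so the result does
// not depend on the byte order of the machine running the code.
std::uint16_t readU16LE(const Bytes& data, std::size_t offset) {
  if (data.size() < 2 || offset > data.size() - 2) {
    throw std::out_of_range("unexpected end of data");
  }
  return static_cast<std::uint16_t>(data[offset] | (data[offset + 1] << 8));
}

struct Rgba { std::uint8_t r, g, b, a; };

// Expands a 6-bit channel (0..63) to 8 bits (0..255). Callers must validate
// the input first; the assert only documents that precondition.
std::uint8_t expand6To8(std::uint8_t value) {
  assert(value <= 63);
  return static_cast<std::uint8_t>((value << 2) | (value >> 4));
}

// Converts palette-indexed pixels into RGBA, throwing on corrupt input
// instead of reading out of bounds.
std::vector<Rgba> toRgba(const Bytes& indices, const Bytes& palette) {
  std::vector<Rgba> result;
  result.reserve(indices.size());
  for (const auto index : indices) {
    const std::size_t base = std::size_t{index} * 3;
    if (palette.size() < 3 || base > palette.size() - 3) {
      throw std::out_of_range("palette index out of range");
    }
    for (std::size_t i = 0; i < 3; ++i) {
      if (palette[base + i] > 63) {
        throw std::invalid_argument("palette channel out of range");
      }
    }
    result.push_back({expand6To8(palette[base]),
                      expand6To8(palette[base + 1]),
                      expand6To8(palette[base + 2]),
                      255});
  }
  return result;
}
```

Even this toy version is noticeably longer than copying bytes straight into video memory, which is the point being made above: the bounds checks and conversions are where the extra lines go.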

    2. I agree it would be interesting to compare, even if I fear the original game probably has a lot of assembly code…

  1. Along with graphic asset replacement, I’d also be interested in audio and music asset replacement. For music my fantasy is to see the whole spectrum supported: IMF, MID/HMI, MOD/XM/IT/S3M/etc, WAV, OGG/FLAC (with metadata loop tags for seamless looping, see EDuke32’s implementation of that). Keep up the good work!

  2. Heyya awesome work. Happy to help with the upscaled graphics. Would be keen to contribute on that front.

    1. Thanks 🙂 Cool, that’d be much appreciated. Feel free to contact me at “my user name 128 AT web DOT de” and we can figure something out.
