So first things first… the multi-directional scrolling. Now chances are I started with the multiplexor really… but let’s go with this first, as it’s the simplest.

Unlike on consoles, you had to manually copy the screen to scroll on the C64, so scrolling was pretty expensive. On Blood Money, I used a double buffered screen for rendering, at locations $4000 and $4400. This let me copy the whole character map screen each frame to erase bullets, bombs, bomb explosions, turrets and character map sprites (like baddies and doors). This does come at a considerable cost; however, unlike most C64 games, I ran at 25Hz and not 50Hz. This gave me some leeway, but not as much as you’d think.
The 2 hardware screens I have are alternated each game cycle, with the hidden one (the back buffer) being copied from 2 other game screens.
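
To make the flip concrete, here is a minimal Python sketch of two alternating screens. It is a toy model, not the original 6502 code: only the $4000 and $4400 addresses come from the text, while the `DoubleBuffer` class and its method names are made up for illustration.

```python
SCREEN_A = 0x4000       # first hardware screen (address from the text)
SCREEN_B = 0x4400       # second hardware screen (address from the text)
SCREEN_SIZE = 40 * 25   # one C64 character screen: 40 columns x 25 rows


class DoubleBuffer:
    """Toy model of two alternating screens in a 64K address space (hypothetical helper)."""

    def __init__(self):
        self.memory = bytearray(0x10000)
        self.visible = SCREEN_A   # the screen currently being displayed
        self.hidden = SCREEN_B    # the back buffer we are free to overwrite

    def render(self, source):
        """Overwrite the hidden screen with a clean copy of the source map.

        Copying everything each frame is what erases last frame's
        bullets, explosions and character-map sprites.
        """
        if len(source) != SCREEN_SIZE:
            raise ValueError("source must be exactly one screen in size")
        self.memory[self.hidden:self.hidden + SCREEN_SIZE] = source

    def flip(self):
        """Swap roles: the freshly drawn screen becomes the visible one."""
        self.visible, self.hidden = self.hidden, self.visible
```

Each game cycle would call `render()` and then `flip()`, so drawing never touches the screen that is currently on display.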

Each game screen is a list of 32 5×5 macro tiles. What I mean by that is that I use a 5×5 block of C64 8×8 characters to make what I call a macro tile. These are then used to build up the screens that I scroll. So each 40×20 screen takes up just 32 bytes, plus however many 5×5 tiles I use. The editor I wrote therefore lets you define both 5×5 tiles, and screens by selecting these tiles. I know… clear as mud.
So, when I first start a level, I draw the first screen full of 5×5s, then copy this into the back buffer, and flip that to display. From then on, we just build strips as a new row of characters comes on.
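
If the macro tile idea is still clear as mud, this Python sketch may help. It expands a 32-byte screen into a 40×20 character map. The 8×4 grid follows from the sizes in the text (40/5 by 20/5), but the row-major layout of both the screen list and the tile data is an assumption, and `expand_screen` is a hypothetical name.

```python
TILE_SIZE = 5                        # a macro tile is 5x5 characters
SCREEN_W, SCREEN_H = 40, 20          # visible play area in characters
TILES_ACROSS = SCREEN_W // TILE_SIZE # 8 macro tiles per row
TILES_DOWN = SCREEN_H // TILE_SIZE   # 4 rows of macro tiles, 8 * 4 = 32


def expand_screen(screen, tiles):
    """Expand 32 macro tile indices into a 40x20 character map.

    screen: 32 tile indices, assumed row-major (left to right, top to bottom).
    tiles:  list of 25-byte macro tiles, each assumed row-major.
    """
    out = bytearray(SCREEN_W * SCREEN_H)
    for ty in range(TILES_DOWN):
        for tx in range(TILES_ACROSS):
            tile = tiles[screen[ty * TILES_ACROSS + tx]]
            for row in range(TILE_SIZE):
                # Destination of this tile row inside the character map.
                dst = (ty * TILE_SIZE + row) * SCREEN_W + tx * TILE_SIZE
                out[dst:dst + TILE_SIZE] = tile[row * TILE_SIZE:(row + 1) * TILE_SIZE]
    return out
```

The saving is easy to see: 800 characters of map are described by 32 bytes plus the shared tile definitions.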
Now so far, this is pretty simple in terms of C64 scrollers. You move the screen, you print a new row of characters at the edge. Blood Money’s multi-directional scrolling makes this a little more complicated. Because I copy the whole screen each frame, I don’t “move” the back buffer screens, which would literally involve copying the whole visible screen 1 character to the left. Instead, as I copy the screen, I copy what consoles do, just slowly. I call a copy function where I point to the top left of the screen, but that top left can move.

So above shows how I work this. A hardware screen is 40 characters wide, but my scrolling back buffer has space for an extra row or column of 5×5 tiles. So my software screen is 45×25 characters in size, while the visible screen is just 40×20 in size. This gives me that scroll buffer needed to draw a new row of tiles easily.
As to copying, the “X” register would be the index into that screen: I copy a whole column of characters, increment the value, then check to see if we’re past 44, and if so, reset to 0.
Though… as you can see from the source, I do this backwards, as it means one less compare. On the 6502, a decrement sets the zero and negative flags automatically, so there’s no need for a separate compare against 0. That saves 40 compares, or 80 cycles, working out at just over a scan line! (One scanline on the C64 is about 65 cycles.) That’s a big saving for just going backwards….
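
Here is a rough Python model of that backwards column copy. It is a sketch under assumptions, not the original routine: `copy_view` is a made-up name, and rows are wrapped with a plain modulo for brevity. The 45×25 buffer and 40×20 view sizes are the ones given above.

```python
BUF_W, BUF_H = 45, 25    # software scroll buffer, in characters
VIEW_W, VIEW_H = 40, 20  # visible play area, in characters


def copy_view(buffer, left, top):
    """Copy a 40x20 window out of the 45x25 buffer, wrapping at the edges.

    (left, top) is the movable top-left corner of the window.
    Columns are walked right to left, mirroring the backwards loop.
    """
    view = bytearray(VIEW_W * VIEW_H)
    src_x = (left + VIEW_W - 1) % BUF_W      # right-most visible column
    for dst_x in range(VIEW_W - 1, -1, -1):
        for y in range(VIEW_H):
            src_y = (top + y) % BUF_H        # simplification: modulo row wrap
            view[y * VIEW_W + dst_x] = buffer[src_y * BUF_W + src_x]
        src_x -= 1
        if src_x < 0:
            # On the 6502 the decrement itself sets the negative flag,
            # so this test costs a branch but no separate compare.
            src_x = BUF_W - 1
    return view
```

Scrolling then comes down to changing `left` or `top` between frames; nothing inside the buffer has to be shifted.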

This code is dynamically updated, or “self modifying code” as we say. This again saves a huge number of cycles compared to using zero page and static code, and was a common trick back in the day.

This is the code that sets up the big loop, and it simply copies the start address of each line into the copy loop. To aid in vertical scroll wrapping, I opt for doubling the table size – as it’s not very big, and this saves having to do any Y checking. This lets the Y address wrap without any real effort as the Y goes off the end of the first table <Screen+(ScreenLine*25) and moves to the next entry <Screen.
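
The doubled table can be sketched in Python like this. `SCREEN_BASE` is a made-up address and `visible_line_addresses` is a hypothetical helper; the split into low-byte and high-byte tables is an assumption based on the `<` (low byte) operator in the labels above.

```python
BUF_W, BUF_H = 45, 25   # software scroll buffer, in characters
VIEW_H = 20             # visible rows
SCREEN_BASE = 0x6000    # hypothetical buffer address, for illustration only

# Start address of each of the 25 buffer lines, then the same 25 again.
LINE_TABLE = [SCREEN_BASE + row * BUF_W for row in range(BUF_H)] * 2

# A 6502 would store these as separate low-byte and high-byte tables.
LINE_LO = [addr & 0xFF for addr in LINE_TABLE]
LINE_HI = [addr >> 8 for addr in LINE_TABLE]


def visible_line_addresses(top):
    """Return the 20 line start addresses for a window starting at row `top`.

    Because the table is doubled, top + 19 never runs off the end for any
    top in 0..24, so no wrap check is needed while walking the table.
    """
    return LINE_TABLE[top:top + VIEW_H]
```

With `top` at 24, the second address returned is already back at the first buffer line, which is the wrap falling out of the table layout for free.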

Now all that’s left is how I swap scrolling directions. This is simply part of the level data: an array of directions, each of which we scroll in for a whole screen.

As you can see, we have 32 screens of action, with 32 bytes per screen making 1024 bytes of tile data.
Lastly… IRQs. We obviously have a raster split since we have a static panel at the bottom. This was a nightmare. Not just getting the split to remain in the right place, but to not flicker like a buggery. Aside from remembering the pure “Pain” of doing this, I don’t remember the specifics, other than that the first line of a character adds extra DMA fetch time, which causes the number of CPU cycles available on that scanline to drop, and that means the number of CPU delay cycles I need to get a perfect split changes. On top of this, when a sprite overlaps that line, you get even more cycles stolen. All this adds up to a raster split that flickers like a feker. I’m pretty sure it’s possible to do… but I didn’t have the time or patience to do it back then, which is a shame. Also, not all the DMA rules were known back then, but with some highly accurate emulators these days, we now probably know more than the creators of the machine!
Incidentally… if you’re interested in making a Blood Money style game on a modern PC, I did start doing one of my Game In a Day challenges for Blood Money, but ran out of assets! You can see that here.
Next time… it gets juicy with the Hardware Sprite Multiplexor.