FizzBuzz Mario World: Learning Assembly Language & Having Some Fun

06.11.2020

I've wanted to learn Assembly language (ASM) programming for a long time. I finally found the perfect project to do it: hacking Super Mario World (SMW). It was a lot of fun so I thought I'd document the process.

The SMW ROM hacking community is vibrant. It's an impressively talented and creative community that makes a lot of awesome games with custom graphics, music, level design, and game physics. Stumbling upon these hacks on YouTube started me down this rabbit hole.

TLDR: the code and a list of the resources mentioned in this post can be found on GitHub.

FizzBuzz Mario

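FizzBuzz is a common problem that beginner programmers solve for practice. The objective is to loop from 1 to 100 and:

print "Fizz" if the number is divisible by 3
print "Buzz" if the number is divisible by 5
print "FizzBuzz" if the number is divisible by both 3 and 5
print the number itself otherwise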

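The goal of this project is to solve a problem similar to FizzBuzz by writing custom ASM that can be patched, or inserted, into the SMW code. The perfect context for FizzBuzz in SMW is the coin count. At any one time the player can have between 0 and 99 coins. Additionally, Mario can have one of four power-up statuses: small, big, cape, and fire. Here's the behavior we'll hack into our version of SMW, applied every time the player collects a coin:

if the coin count is divisible by both 3 and 5, Mario becomes fire Mario
if the coin count is divisible by 5, Mario becomes cape Mario
if the coin count is divisible by 3, Mario becomes big Mario
otherwise, Mario becomes small Mario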

[Video: game-play of the finished product]

Getting Started

After a few minutes of searching the web I found the source of most of the information I'll be sharing: SMW Central. You can find a full list of resources at the end of this article, but the guides I found most helpful were Ersanio's Assembly for the SNES and Assembly for Super Mario World. The former assumes no knowledge of ASM. The latter refreshes the ASM information from the former and then goes into how to apply a simple patch similar to the one we'll be writing. Both are quick reads and I recommend them to any programmer who wants to know what ASM programming feels like.

Memory

ASM is a low-level language in which you deal directly with individual bytes of memory. SNES games involve two kinds of memory: read-only memory (ROM) and random-access memory (RAM). You can think of the ROM as the game cartridge itself. Instead of an actual cartridge and SNES, the ROM for present purposes is a computer file that you can play on a SNES emulator. The ROM is where all the code for how the game works is stored. This memory, as the name implies, is usually only ever read from. Our goal is to hack this ROM by overwriting a small part of it, thus changing how the game behaves.

The RAM is on the SNES itself and it's where values are stored that will change while the game is played. The coin counter value needs to live in RAM because it changes often. The same goes for the player's power-up status.

In both ROM and RAM, memory is a long list of addresses where values can be stored. These addresses are represented in hexadecimal (hex). For our purposes we'll need to figure out the memory addresses for three things: the value of the coin count, the value of the power-up status, and where to insert some custom code.

Fortunately, people have completely disassembled SMW and mapped the RAM and ROM, so finding what we need is a simple web search. The RAM map can be found on SMW Central. The RAM begins at address $7E0000 and ends at $7FC800. The $ indicates hex. By searching a few relevant keywords I found that the coin count is stored at address $7E0DBF and the power-up status is stored at address $7E0019. The entry for power-up status also lists the 4 possible values for this address and what they mean: 0 for small Mario, 1 for big Mario, 2 for cape Mario, and 3 for fire Mario.

This gives us a basic idea of what our ASM code will need to do, in pseudocode:

every time the coin count increases:
  get the value of RAM address $7E0DBF (coin count)
  if that value is divisible by 15
    store 3 in RAM address $7E0019 (power-up status)
  else if that value is divisible by 5
    store 2 in RAM address $7E0019
  else if that value is divisible by 3
    store 1 in RAM address $7E0019
  else store 0 in RAM address $7E0019
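
Before touching any ASM, it can help to pin down the target behavior in a language where modulo is built in. Here is a minimal Python sketch of the pseudocode above. The function name power_up_for and the constant names are my own inventions for illustration; only the values 0 to 3 come from the RAM map.

```python
# Power-up values as listed in the RAM map entry for the power-up status.
SMALL, BIG, CAPE, FIRE = 0, 1, 2, 3


def power_up_for(coin_count: int) -> int:
    """Return the power-up value the patch should store for a coin count."""
    if coin_count % 15 == 0:   # divisible by both 3 and 5
        return FIRE
    if coin_count % 5 == 0:
        return CAPE
    if coin_count % 3 == 0:
        return BIG
    return SMALL


if __name__ == "__main__":
    names = {SMALL: "small", BIG: "big", CAPE: "cape", FIRE: "fire"}
    for coins in range(1, 16):
        print(coins, names[power_up_for(coins)])
```

The order of the checks matters: 15 has to be tested first, otherwise a coin count like 30 would be caught by the 5 or 3 case and never reach fire Mario.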

Now we need to figure out where to insert our code. I want this code to run whenever the player gets a coin, so I searched for "coin count" within the ROM map and found $008F1D, a 30-byte piece of code that "handles actually increasing the player's coin count and giving a life from 100 coins." This is a good start, but we can't just insert code into 30 bytes of ROM without seeing what it does. We will break everything if we're not careful. Unfortunately the ROM map on SMW Central doesn't have the actual code stored at this address. But then I found All.log++: a complete disassembly of the SMW source code, in ASM, with extensive comments and labels.

Inspecting the disassembled SMW source code at this coin count code address, I noticed that the address where the coin count is actually increased is $008F25. That is, this is the memory address of ROM that stores the instruction that literally adds 1 to the coin count every single time the player receives 1 coin. But instead of just increasing the coin count by 1, I want to patch the ROM so that when the code at this address gets executed, the SNES also runs my FizzBuzz code.

Now we have all of the relevant memory addresses that we need, and we have the pseudocode that we need to run at the coin count increase ROM address. Now we assemble.

Writing the Code

Our first two lines are easy:

!PowerUpStatus = $0019
!CoinCount = $0DBF

We simply assign the RAM addresses that we need for the coin count and power-up status to some labels (Asar calls these defines) so they'll be easier to refer to in the code. Notice that we dropped the 7E from both addresses. We don't need it. These lines don't actually get patched into the ROM. All occurrences of the labels in the code that we will write will get replaced with the addresses by the assembler. The assembler is the thing that will take our code and insert it into the ROM.
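
To make "dropping the 7E" concrete, here is a small Python sketch that splits a full 24-bit SNES address into its bank byte and 16-bit offset. The helper name split_snes_address is hypothetical. My understanding is that the short form works because the first 8 KB of RAM ($0000-$1FFF) is also visible from the banks the game code runs in, but treat that as an assumption to check against Ersanio's guides.

```python
def split_snes_address(address: int) -> tuple[int, int]:
    """Split a 24-bit SNES address into (bank byte, 16-bit offset)."""
    return (address >> 16) & 0xFF, address & 0xFFFF


if __name__ == "__main__":
    for name, full in (("power-up status", 0x7E0019), ("coin count", 0x7E0DBF)):
        bank, offset = split_snes_address(full)
        # The offset is the part we keep in !PowerUpStatus and !CoinCount.
        print(f"{name}: bank ${bank:02X}, offset ${offset:04X}")
```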

Hijacking the Coin Count Increase Code

We know where we want to insert our code: the ROM address where the coin count is increased. But we can't insert all of our code at this address, because we'd overwrite a lot of stuff and break the game. We can only insert a few bytes, and we have to make sure the instructions that we overwrite are executed by us in our own code so that everything the original code was supposed to do still happens. So what we'll do is insert one instruction in the ROM at the coin count increase address that tells the processor to jump to the rest of our code. Then we'll tell the assembler to insert our code in some free space so we don't overwrite anything. Here's what that looks like:

ORG $008F25
JSL FizzBuzz
NOP
NOP

freecode

ORG $008F25 instructs the assembler to insert the following instruction, JSL FizzBuzz, into the ROM address where the coin count is increased. What JSL FizzBuzz means is: Jump to the Subroutine code labeled FizzBuzz. Most ASM operation codes, or opcodes, are mnemonic. The J and S are for Jump and Subroutine. The L stands for Long, but we can ignore it; it's beyond the scope of this post.

What's with the NOPs? As it turns out, the code we've chosen to overwrite, the coin count increase instruction, is 3 bytes long (we know this because All.log++ tells us that). The code we insert, JSL FizzBuzz, is 4 bytes long. So in our attempt to overwrite one 3-byte instruction, we also overwrote the first byte of a second instruction.

The fix for this is that we need to remember to execute both instructions that we overwrote in the code we write. The second instruction we overwrite, like the first, is 3 bytes long, for a total of 6 bytes. But again, the code we inserted is only 4 bytes: there are still 2 dangling bytes that were previously part of that second instruction. That's not good. Random, partially overwritten bytes in the code will break the game. So, we include 2 NOP instructions. These are No OPerations, and each one is 1 byte long. We do nothing for 2 bytes to fill up the space not filled up by the previous 4 bytes of code we inserted. To recap, we overwrote the 6 bytes of 2 instructions with 6 bytes of our own instructions: a 4-byte jump to our custom code and 2 bytes of padding.
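
The byte accounting is easier to trust when you can see it. This Python sketch lays out the 6 original bytes next to the 6 bytes of the hijack. The opcode values are the standard 65c816 encodings for these instructions; FIZZBUZZ_ADDR is a made-up placeholder, since the real address is whatever free space the assembler picks for freecode.

```python
# Opcode bytes for the 65c816 instructions involved in the hijack.
INC_ABS, LDA_ABS, JSL_LONG, NOP = 0xEE, 0xAD, 0x22, 0xEA

COIN_COUNT = 0x0DBF        # !CoinCount from the patch
FIZZBUZZ_ADDR = 0x108000   # hypothetical: the assembler chooses the real address


def le16(value):
    """Encode a 16-bit value as two bytes, low byte first."""
    return [value & 0xFF, (value >> 8) & 0xFF]


def le24(value):
    """Encode a 24-bit value as three bytes, low byte first."""
    return le16(value) + [(value >> 16) & 0xFF]


def hexdump(data):
    return " ".join(f"{byte:02X}" for byte in data)


# INC !CoinCount followed by LDA !CoinCount: 3 bytes each.
original = [INC_ABS, *le16(COIN_COUNT), LDA_ABS, *le16(COIN_COUNT)]
# JSL FizzBuzz (4 bytes) followed by two 1-byte NOPs.
patched = [JSL_LONG, *le24(FIZZBUZZ_ADDR), NOP, NOP]

if __name__ == "__main__":
    print("original:", hexdump(original))  # EE BF 0D AD BF 0D
    print("patched: ", hexdump(patched))   # 22 00 80 10 EA EA
```

Both sequences are exactly 6 bytes, so the instruction that follows the overwritten pair still starts where the processor expects it to.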

freecode just means find some free space in the ROM to put the rest of our custom ASM code.

The FizzBuzz Subroutine

The following code is the beginning of the FizzBuzz subroutine that we referenced above.

FizzBuzz:
INC !CoinCount
LDA #$0F
STA $00
LDA !CoinCount
JSR Mod

The first line is the label. The second, INC !CoinCount, means INCrement the value in the memory address for the coin count by 1. This is what the first instruction of the code we overwrote was supposed to do, so we do it here ourselves.

In ASM, one of the most important things is the accumulator. Essentially, the accumulator (A) is a register, a small storage slot inside the microprocessor itself rather than an address in memory, where the microprocessor puts the results of math and logic operations. We can also load values into it for use in math operations. LDA #$0F does just that. It LoaDs into the Accumulator the value 15. 0F is hex for the decimal value 15, and the # means we want the value 15 itself, not what's stored at the memory address 15.

We then run STA $00, which STores the value of the Accumulator into memory address $00. This memory address is "scratch" memory that has no assigned purpose other than as a place to store temporary values. We couldn't just store a value directly into $00. Instead it was a two-step process: load a value into A, then store the value of A in $00. Next, LDA !CoinCount loads the value of the coin count into A.
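
The difference between LDA #$0F and LDA $0F trips up a lot of beginners, so here is a toy Python model of it. The Cpu class and its method names are purely illustrative, and the memory contents are made up; the only point is to show "the value itself" versus "whatever is stored at that address".

```python
class Cpu:
    """A toy model with one 8-bit accumulator and a dictionary for memory."""

    def __init__(self, memory):
        self.a = 0
        self.memory = memory

    def lda_immediate(self, value):   # like LDA #$nn: load the value itself
        self.a = value & 0xFF

    def lda_address(self, address):   # like LDA $nn: load what is stored there
        self.a = self.memory[address]

    def sta(self, address):           # like STA $nn: copy A into memory
        self.memory[address] = self.a


# Made-up starting contents: address $0F happens to hold $99.
cpu = Cpu({0x00: 0x00, 0x0F: 0x99})

cpu.lda_immediate(0x0F)   # LDA #$0F -> A is 15
cpu.sta(0x00)             # STA $00  -> scratch address $00 now holds 15
cpu.lda_address(0x0F)     # LDA $0F  -> A is $99, the contents of address $0F

if __name__ == "__main__":
    print(cpu.memory[0x00], hex(cpu.a))
```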

So now we have two values in place: the coin count, held in A, and 15, stored in $00. To check if something is divisible by both 3 and 5, we can divide it by 15 and check if there's a remainder. Many programming languages have a modulo operator that gives you the remainder of one number divided by another. For example, 47 modulo 15 is 2. JSR Mod means Jump to the SubRoutine labeled Mod. ASM for the 65c816 microprocessor doesn't have a modulo operator so we have to write it ourselves. Its explanation is more complicated than the rest and I don't want it to distract us at the moment, so you can find it at the end of the post. JSR Mod will find the remainder of the coin count divided by the value in $00, which is 15 here. Importantly, it will leave that remainder in A. Then we execute the following code:

BNE TestMod5
LDA #$03
STA !PowerUpStatus
BRA Return

BNE TestMod5 means Branch to TestMod5 if Not Equal. In effect it checks whether the remainder that Mod left in A is 0 (strictly speaking, it tests the processor's zero flag, which the last arithmetic instruction in Mod set according to that result). If A isn't 0, it branches to TestMod5, skipping the other three instructions in that code snippet. When would A equal 0? When the coin count divided by 15 has 0 as a remainder. If A is 0, then the code won't branch. Instead, LDA #$03 loads the value 3 into A, and then STA !PowerUpStatus stores the value of A into the power-up status memory address. 3 is the value for fire Mario.

We've just coded the logic so that when the player gets a coin, and the coin count is divisible by 3 & 5, we set Mario to fire Mario. And if it was divisible by 3 & 5, we're done! So we execute BRA Return which means BRAnch to Return. We'll see what that does later. If A did not equal 0, then the coin count was not divisible by 3 and 5, so we didn't set Mario to fire Mario. That means we need to test the next case, whether or not the coin count is divisible by 5. The following code is where the branch above goes if the coin count mod 15 doesn't equal 0:

TestMod5:
LDA #$05
STA $00
LDA !CoinCount
JSR Mod
BNE TestMod3
LDA #$02
STA !PowerUpStatus
BRA Return

I leave it to the reader to work out what's happening here step by step. There's nothing new. We simply repeat what we did above, but for 5. If the coin count mod 5 is not 0, we branch again to test 3:

TestMod3:
LDA #$03
STA $00
LDA !CoinCount
JSR Mod
BNE SetSmall
LDA #$01
STA !PowerUpStatus
BRA Return
SetSmall:
STZ !PowerUpStatus

There's some more of the same here for 3. But when the coin count mod 3 doesn't equal 0 we branch to SetSmall. The code there, STZ !PowerUpStatus, STores Zero in the power-up status address. We only reach that code if the coin count wasn't divisible by 15, 5, or 3, so we set Mario to small and then fall through to the code at the Return label. If any of the earlier cases set the power-up status, they branched straight to Return:

Return:
LDA !CoinCount
RTL

These instructions do two things. First, they load the coin count into A. This is the second instruction that our inserted code overwrote. Recall at the very beginning we inserted a jump to our code into the memory location where the coin count is increased. We overwrote two instructions and LDA !CoinCount was that second instruction, so we do it here before we return to where we inserted our code. Then, we RTL, which means Return to wherever we JSLed from.

The Mod Subroutine

ASM, at least for the 65c816, doesn't have many opcodes. It doesn't even have multiplication and division, let alone the modulo operator. Writing a modulo subroutine in any other language is simple. If you want to know what x modulo y is, subtract y from x, then check if x is less than 0. If it is, add y back to x and that's your remainder. If it isn't, repeat the subtraction and ask again. Like so:

18 modulo 7

18 - 7 = 11
11 - 7 = 4
4 - 7 = -3

-3 is less than 0, so the remainder is -3 + 7 = 4

Here is the ASM:

Mod:
SEC
SBC $00
BCS Mod
ADC $00
RTS

The key to understanding this is understanding how something called the carry flag works. Admittedly, I know just enough about the carry flag to get this to work, so my explanation is not complete. But what I do know is that if the carry flag is set to 1, and you then subtract a number from the accumulator that is larger than the accumulator's value (so the true result would be negative), the carry flag is cleared to 0. This is good, because our modulo algorithm requires us to know when we've arrived at a negative number.

The first thing we do, SEC, is SEt the Carry flag (set it equal to 1). Then we SuBtract with Carry the value stored in $00 from A. Remember the code above when we kept putting the coin count in A and either 3, 5, or 15 in $00. This is why.

Now that we've subtracted the value in $00 from A, we execute BCS Mod. This means Branch if Carry is Set to Mod. If the subtraction produced a result of 0 or more, then the carry flag is still set, so we go back and do it all again. If the carry flag isn't set, then we know the subtraction went below 0, so we continue on. The next thing we do is ADC $00, which means ADd with Carry to A. We add the value in $00 back to A. What this means is that now the value of A, the accumulator, is the remainder of the original A (the coin count) modulo the value in $00 (15, 5, or 3). And finally, we RTS, or ReTurn from Subroutine and go back to where we jumped from. There we go. Modulo in assembly in Super Mario World.
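
Since my grasp of the carry flag is shaky, I find it helps to model the Mod loop in a higher-level language and watch the flag change. Below is a Python sketch under a few assumptions: the accumulator is 8 bits wide, decimal mode is off, and the divisor is never 0 (a 0 divisor would loop forever, just as it would in the ASM). The helper names sbc, adc, and mod are mine, not part of any real tool.

```python
def sbc(a, operand, carry):
    """8-bit SuBtract with Carry: A - operand - (1 - carry).

    Returns (result, carry_out). Carry stays 1 when no borrow was needed
    and drops to 0 when the true result went below zero.
    """
    diff = a - operand - (1 - carry)
    return diff & 0xFF, 1 if diff >= 0 else 0


def adc(a, operand, carry):
    """8-bit ADd with Carry: A + operand + carry. Returns (result, carry_out)."""
    total = a + operand + carry
    return total & 0xFF, 1 if total > 0xFF else 0


def mod(a, divisor):
    """Mirror the Mod subroutine: SEC / SBC $00 / BCS Mod / ADC $00 / RTS."""
    while True:
        carry = 1                          # SEC
        a, carry = sbc(a, divisor, carry)  # SBC $00
        if not carry:                      # BCS Mod loops only while carry is set
            break
    # Carry is 0 here, so ADC adds exactly the divisor back and nothing extra.
    a, carry = adc(a, divisor, carry)      # ADC $00
    return a                               # RTS, with the remainder in A


if __name__ == "__main__":
    print(mod(18, 7))    # 4, matching the worked example
    print(mod(30, 15))   # 0, so the BNE after JSR Mod would not branch
```

One detail this makes visible: in 8 bits, "negative" really means the value wrapped around (4 - 7 becomes 253), and adding the divisor back wraps it around again to the true remainder.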

Conclusion

This project was a lot of fun, and it was the introduction to ASM that I always wanted, without even knowing it. I don't know if I'll have time to continue hacking SMW, but I may try to find other avenues to explore ASM, like embedded programming.

As it turns out, this small change of behavior that we patched into SMW is actually a lot of fun! Unfortunately, I can't disseminate the hacked ROM, but if you'd like to play it, please shoot me an email: vgabruzzo@gmail.com, or simply google around about how to patch the SMW ROM yourself with the code we've written.

Resources

Playing SNES ROMs

SMW Central: source for all things SMW
SNES9X: Emulator for Playing SNES ROMs

Programming

Ersanio's Assembly for the SNES
Ersanio's Assembly for Super Mario World
SMW RAM & ROM Maps
All.log++ Annotated SMW ROM Map
SNES Assembler Asar