[DevoxxFR 2024] Super Tech’Rex World: The Assembler Strikes Back
Nicolas Grohmann, a developer at Sopra Steria, took attendees on a nostalgic journey through low-level programming with his talk, “Super Tech’Rex World: The Assembler Strikes Back.” Over five years, Nicolas modified Super Mario World, a 1990 Super Nintendo Entertainment System (SNES) game coded in assembler, transforming it into a custom adventure featuring a dinosaur named T-Rex. Through live coding and engaging storytelling, he demystified assembler, revealing its principles and practical applications. His session illuminated the inner workings of 1990s consoles while showcasing assembler’s relevance to modern computing.
A Retro Quest Begins
Nicolas opened with a personal anecdote, recounting how his project began in 2018, before Sopra Steria’s Tech Me Up community formed in 2021. He described this period as the “Stone Age” of his journey, marked by trial and error. His goal was to hack Super Mario World, a beloved SNES title, replacing Mario with T-Rex, coins with pixels (a Sopra Steria internal currency), and mushrooms with certifications that boost strength. Enemies became “pirates,” symbolizing digital adversaries.
To set the stage, Nicolas showcased the SNES, a 1990s console with a CPU, ROM, and RAM—components familiar to modern developers. He launched an emulator to demonstrate Super Mario World, highlighting its mechanics: jumping, collecting items, and battling enemies. A modified ROM revealed his custom version, where T-Rex navigated a reimagined world. This demo captivated the audience, blending nostalgia with technical ambition.
For the first two years, Nicolas relied on community tools to tweak graphics and levels, such as replacing Mario’s sprite with T-Rex. However, as a developer, he yearned to contribute original code, prompting him to learn assembler. This shift marked the “Age of Discoveries,” where he tackled the language’s core concepts: machine code, registers, and memory addressing.
Decoding Assembler’s Foundations
Nicolas introduced assembler’s essentials, starting with machine code, the binary language of 0s and 1s that CPUs understand. These bits are grouped into 8-bit bytes (octets), and a SNES ROM comprises 1–4 megabytes of such code. He clarified binary and hexadecimal systems, noting that hexadecimal (0–9, A–F) compacts binary for readability. For example, 15 in decimal is 1111 in binary and 0F in hexadecimal, while 255 (all 1s in a byte) is FF.
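The conversions above can be checked with a short Python sketch. The helper name describe_byte and the % and $ prefixes (a common assembler notation for binary and hexadecimal) are illustrative choices, not something shown in the talk.

```python
def describe_byte(value: int) -> str:
    """Render one byte (octet) in decimal, binary, and hexadecimal."""
    if not 0 <= value <= 0xFF:
        raise ValueError("a byte holds values from 0 to 255")
    # 08b pads to eight binary digits, 02X to two uppercase hex digits.
    return f"{value:3d} = %{value:08b} = ${value:02X}"


for sample in (15, 255):
    print(describe_byte(sample))
```

Each hexadecimal digit stands for exactly four bits, which is why two digits always cover one byte.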
Next, he explored registers, small memory locations within the CPU, akin to global variables. The accumulator, a key register, stores a single octet for operations, while the program counter tracks the next instruction’s address. These registers enable precise control over a program’s execution.
Nicolas’s favorite concept was memory addressing, which he explained by likening SNES memory to a city. Each octet resides in a “house” (address 00–FF), within a “street” (page 00–FF), in a “neighborhood” (bank 00–FF). With 256 values at each level, this structure yields 256 × 256 × 256 octets, or 16 megabytes of addressable memory. Three addressing modes trade generality for compactness: long (full address), absolute (bank preset), and direct page (bank and page preset). Direct page, limited to 256 addresses, needs only a one-octet operand, which makes it ideal for frequently used game variables.
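As a rough illustration of the city analogy, the following Python sketch builds a 24-bit address from its bank, page, and house parts and splits it back. The function names are hypothetical, and the sketch models only the arithmetic, not the console's actual memory mapping.

```python
def compose_address(bank: int, page: int, offset: int) -> int:
    """Combine bank (neighborhood), page (street), and offset (house)."""
    for name, part in (("bank", bank), ("page", page), ("offset", offset)):
        if not 0 <= part <= 0xFF:
            raise ValueError(f"{name} must fit in one byte")
    return (bank << 16) | (page << 8) | offset


def split_address(address: int) -> tuple:
    """Split a 24-bit address into (bank, page, offset)."""
    return (address >> 16) & 0xFF, (address >> 8) & 0xFF, address & 0xFF


# 256 banks x 256 pages x 256 houses
ADDRESSABLE_BYTES = 0x100 ** 3

player_state = compose_address(0x7E, 0x00, 0x19)
print(f"${player_state:06X}")
print(ADDRESSABLE_BYTES // (1024 * 1024), "megabytes")
```

Under this model, a long operand carries all three parts, an absolute operand carries page and house, and a direct-page operand carries only the house.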
Assembler, Nicolas clarified, isn’t a single language but a family of instruction sets tailored to CPU types. Instructions are written as mnemonics such as LDA (load accumulator) and STA (store accumulator), and each one translates to a machine-code opcode that depends on the addressing mode (e.g., LDA becomes A5 for direct page). These opcodes, combined with addressing modes, form the backbone of assembler programming.
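To make the mnemonic-to-opcode translation concrete, here is a toy assembler for a handful of LDA and STA forms. The opcode values are the standard 65816 encodings; the assemble function and mode names are hypothetical, and the one-byte immediate operand assumes the accumulator is in 8-bit mode.

```python
# (mnemonic, addressing mode) -> opcode byte, standard 65816 encodings
OPCODES = {
    ("LDA", "immediate"): 0xA9,
    ("LDA", "direct"): 0xA5,
    ("LDA", "absolute"): 0xAD,
    ("LDA", "long"): 0xAF,
    ("STA", "direct"): 0x85,
    ("STA", "absolute"): 0x8D,
    ("STA", "long"): 0x8F,
}

# Operand width in bytes for each mode (immediate assumes 8-bit accumulator)
OPERAND_SIZE = {"immediate": 1, "direct": 1, "absolute": 2, "long": 3}


def assemble(mnemonic: str, mode: str, operand: int) -> bytes:
    """Encode one instruction as opcode followed by a little-endian operand."""
    opcode = OPCODES[(mnemonic, mode)]
    return bytes([opcode]) + operand.to_bytes(OPERAND_SIZE[mode], "little")


print(assemble("LDA", "direct", 0x19).hex(" "))
print(assemble("STA", "long", 0x7E0019).hex(" "))
```

The same store costs two bytes in direct-page form and four in long form, which is the efficiency gain the addressing modes offer.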
Live Coding: Empowering T-Rex
Nicolas transitioned to live coding, demonstrating assembler’s practical application. His goal: make T-Rex invincible and alter gameplay to challenge pirates. Using Super Mario World’s memory map, a community-curated resource, he targeted address 7E0019, which tracks the player’s state (0 for small, 1 for large). By writing LDA #$01 (load 1) and STA $19 (store to 7E0019), he ensured T-Rex remained large, immune to damage. The # denotes an immediate value, distinguishing it from an address.
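The effect of that two-instruction patch can be simulated in Python with a byte array standing in for the direct page. The names ram and keep_large are hypothetical, and the sketch assumes the patch runs once per frame, after any damage has been applied.

```python
PLAYER_STATE = 0x19  # direct-page offset of the player's state


def keep_large(ram: bytearray) -> None:
    """Mimic LDA #$01 followed by STA $19."""
    accumulator = 0x01               # LDA #$01: load the literal value 1
    ram[PLAYER_STATE] = accumulator  # STA $19: store it at offset $19


ram = bytearray(0x100)
ram[PLAYER_STATE] = 0x00  # pretend the game just shrank the player
keep_large(ram)
print(ram[PLAYER_STATE])
```

Without the #, LDA $01 would instead read whatever byte is stored at offset $01.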
To nerf T-Rex’s jump, Nicolas manipulated controller inputs at addresses 7E0015 and 7E0016, which store button states as bitmasks (e.g., the leftmost bit for button B, used for jumping). Using LDA $15 and AND #$7F (bitwise AND with 01111111), he cleared the B button’s bit, disabling jumps while preserving other controls. He applied this to both addresses, ensuring consistency.
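The masking step can be sketched in Python as follows. The function name disable_jump is hypothetical, and the store back to memory is an assumption: the summary mentions only the load and the AND, but the cleared value has to be written back for the change to take effect.

```python
B_BUTTON = 0x80       # leftmost bit of the controller byte
CLEAR_B = 0x7F        # 01111111: every bit kept except B

CONTROLLER_ADDRESSES = (0x15, 0x16)


def disable_jump(ram: bytearray) -> None:
    """Clear the B bit in both controller bytes, leaving other bits intact."""
    for address in CONTROLLER_ADDRESSES:
        accumulator = ram[address]   # LDA $15 (then $16)
        accumulator &= CLEAR_B       # AND #$7F
        ram[address] = accumulator   # assumed store back, e.g. STA $15


ram = bytearray(0x100)
ram[0x15] = 0b10000011  # B plus two other buttons held
ram[0x16] = 0b10000000  # B newly pressed this frame
disable_jump(ram)
print(f"{ram[0x15]:08b} {ram[0x16]:08b}")
```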
To restore button B for firing projectiles, Nicolas used 7E0016, which flags buttons pressed in a single frame. With LDA $16, AND #$80 (isolating B’s bit), and BEQ (branch if zero to skip firing), he ensured projectiles spawned only on B’s press. A JSL (jump to subroutine long) invoked a community routine to spawn a custom sprite: a projectile that moves right and destroys enemies.
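The test-and-branch pattern maps onto an ordinary conditional. In this Python sketch, maybe_fire and the spawn_projectile callback are hypothetical stand-ins; the callback plays the role of the community routine reached through JSL, whose real name and address are not given here.

```python
from typing import Callable

B_BUTTON = 0x80
PRESSED_THIS_FRAME = 0x16


def maybe_fire(ram: bytearray, spawn_projectile: Callable[[], None]) -> bool:
    """Call spawn_projectile only when B was pressed on this frame."""
    accumulator = ram[PRESSED_THIS_FRAME]  # LDA $16
    accumulator &= B_BUTTON                # AND #$80: keep only B's bit
    if accumulator == 0:                   # BEQ: result is zero, skip firing
        return False
    spawn_projectile()                     # JSL to the spawning routine
    return True


shots = []
ram = bytearray(0x100)
ram[PRESSED_THIS_FRAME] = 0b10000001
maybe_fire(ram, lambda: shots.append("projectile"))
print(shots)
```

Because the AND leaves either 0 or $80 in the accumulator, the zero flag alone is enough to decide the branch.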
These demos showcased assembler’s precision, leveraging memory maps and opcodes to reshape gameplay. Nicolas’s iterative approach—testing, tweaking, and re-running—mirrored real-world debugging.
Mastering the Craft: Hooks and the Stack
Reflecting on 2021, the “Modern Age,” Nicolas shared how he mastered code insertion. Since modifying Super Mario World’s original ROM risks corruption, he used hooks: redirects to free memory spaces. A tool inserts custom code at an address like $A00, replacing a segment (e.g., four octets) with a JSL (jump subroutine long) to a hook. The hook preserves original code, jumps to the custom code via JML (jump long), and returns with RTL (return long), seamlessly integrating modifications.
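The byte-level effect of installing a hook can be sketched in Python. JSL is encoded as opcode $22 followed by a three-byte little-endian address, which is why four octets are replaced. The install_hook function and the target address are hypothetical, and the sketch treats the ROM as a flat byte array, ignoring the mapping between file offsets and console addresses that a real patching tool handles.

```python
JSL_OPCODE = 0x22  # 65816 encoding of JSL (jump to subroutine long)


def install_hook(rom: bytearray, site: int, target: int) -> bytes:
    """Overwrite four bytes at `site` with JSL `target`.

    Returns the displaced bytes so the hook can reproduce the original code.
    """
    displaced = bytes(rom[site:site + 4])
    rom[site:site + 4] = bytes([JSL_OPCODE]) + target.to_bytes(3, "little")
    return displaced


rom = bytearray(range(0x10))         # stand-in for original game code
saved = install_hook(rom, 0x04, 0x108000)
print(saved.hex(" "), "->", rom[0x04:0x08].hex(" "))
```

Keeping the displaced bytes is what lets the hook run the original instructions before or after the custom code.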
The stack, a RAM region for temporary data, proved crucial. Managed by a stack pointer register, it supports opcodes like PHA (push accumulator) and PLA (pull accumulator). JSL pushes the return address before jumping, and RTL pops it, ensuring correct returns. This mechanism enabled complex routines without disrupting the game’s flow.
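A simplified Python model shows how pushes and pulls pair up. On the 65816 the stack grows downward, and JSL pushes the address of its own last byte, which RTL then pulls and increments. The Stack class, the starting pointer $01FF, and the sample addresses are assumptions for illustration; bank-boundary wrapping is ignored.

```python
class Stack:
    """Downward-growing stack backed by a RAM byte array."""

    def __init__(self, ram: bytearray, pointer: int = 0x01FF) -> None:
        self.ram = ram
        self.pointer = pointer

    def push(self, value: int) -> None:   # PHA-style push
        self.ram[self.pointer] = value & 0xFF
        self.pointer -= 1

    def pull(self) -> int:                # PLA-style pull
        self.pointer += 1
        return self.ram[self.pointer]


def jsl(stack: Stack, instruction_address: int, target: int) -> int:
    """Push the address of the JSL's last byte (bank, high, low), then jump."""
    last_byte = instruction_address + 3   # JSL is four bytes long
    stack.push(last_byte >> 16)
    stack.push(last_byte >> 8)
    stack.push(last_byte)
    return target


def rtl(stack: Stack) -> int:
    """Pull low, high, bank and resume at the following instruction."""
    low, high, bank = stack.pull(), stack.pull(), stack.pull()
    return ((bank << 16) | (high << 8) | low) + 1


stack = Stack(bytearray(0x200))
program_counter = jsl(stack, 0x008A00, 0x108000)
stack.push(0x42)           # PHA inside the subroutine
restored = stack.pull()    # PLA must happen before returning
program_counter = rtl(stack)
print(f"${program_counter:06X}")
```

If the subroutine pushed a value without pulling it, RTL would read that value as part of the return address, so pushes and pulls must stay balanced.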
Nicolas introduced index registers X and Y, which support opcodes like LDX and STX. Indexed addressing (e.g., LDA $00,X) adds X’s value to an address, enabling dynamic memory access. For example, setting X to 2 and using LDA $00,X accesses address $02.
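Indexed addressing amounts to adding the index register to a base address before reading. A minimal Python sketch, where lda_indexed is a hypothetical helper and the memory contents are made up:

```python
def lda_indexed(ram: bytearray, base: int, x: int) -> int:
    """Mimic LDA base,X: read the byte at base + X."""
    return ram[base + x]


ram = bytearray(0x100)
ram[0x00:0x04] = bytes([0x10, 0x20, 0x30, 0x40])  # a small table at $00

x = 2                                # LDX #$02
value = lda_indexed(ram, 0x00, x)    # LDA $00,X reads address $02
print(f"${value:02X}")
```

Changing X between reads lets one instruction walk through a whole table.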
Conquering the Game and Beyond
In a final demo, Nicolas teleported T-Rex to the game’s credits by checking sprite states. Address 7E14C8 and the next 11 addresses track 12 sprite slots (0 for empty). Using X as a counter, he looped through LDA $14C8,X, branching with BNE (branch if not zero) if a sprite exists, or decrementing X with DEX and looping with BPL (branch if positive). If all slots are empty, a JSR (jump subroutine) triggers the credits, ending the game.
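The loop structure can be mirrored in Python. The function all_slots_empty is a hypothetical name, and starting X at 11 and counting down is an assumption that fits the DEX and BPL pairing described above: BPL keeps looping while X is non-negative and falls through once X wraps below zero.

```python
SPRITE_STATUS = 0x14C8  # first of the 12 sprite status bytes
SLOT_COUNT = 12


def all_slots_empty(ram: bytearray) -> bool:
    """Return True when every sprite slot holds 0."""
    x = SLOT_COUNT - 1                    # assumed LDX #$0B
    while x >= 0:                         # BPL: loop while X is not negative
        if ram[SPRITE_STATUS + x] != 0:   # LDA $14C8,X then BNE
            return False                  # a sprite still exists
        x -= 1                            # DEX
    return True                           # fall through: JSR to the credits


ram = bytearray(0x2000)
ram[SPRITE_STATUS + 5] = 0x08             # one slot still occupied
print(all_slots_empty(ram))
ram[SPRITE_STATUS + 5] = 0x00
print(all_slots_empty(ram))
```

Counting down avoids an explicit comparison, since the decrement itself sets the flag that BPL tests.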
Nicolas concluded with reflections on his five-year journey, likening assembler to a steep but rewarding climb. His game, nearing release on the Super Mario World hacking community’s site, features space battles and a 3D boss, pushing SNES limits. He urged developers to embrace challenging learning paths, emphasizing that persistence yields profound satisfaction.
Hashtags: #Assembler #DevoxxFrance #SuperNintendo #RetroGaming #SopraSteria #LowLevelProgramming