Friday, January 16, 2026

asm359 enhancements

There's something deeply satisfying about building tools for a processor that doesn't quite exist yet. The GMS/359 project — a custom FPGA-based computer inspired by IBM's legendary System/360 — has been an exercise in both nostalgia and pragmatism. Today I want to share some thoughts on the assembler toolchain we've built for it.

Why System/360?

When IBM introduced the System/360 in 1964, they created something revolutionary: a family of compatible computers spanning a wide range of performance levels, all running the same software. The architecture introduced concepts we take for granted today — byte-addressable memory, a clean separation between I/O and computation, and a unified Program Status Word that elegantly captures machine state.

But the S/360 also carries sixty years of accumulated baggage. Big-endian byte order made sense when humans read hex dumps on teletypes. The BALR/USING dance for establishing base registers was clever but tedious. Putting the opcode at the end of variable-length instructions optimized for hardware that no longer exists.

GMS/359 keeps what's beautiful about S/360 — the channel I/O model, the clean instruction formats, the PSW concept — while quietly modernizing the rest. Little-endian bytes. Opcode-first encoding. PC-relative addressing. No more base register juggling.

The "/359" isn't a typo. It's a declaration: inspired by, not compatible with.

Design Decisions That Matter

Destination First, Always

One principle drove many syntax decisions: destination first, always. This matches the convention for most instructions in Intel-syntax x86, ARM, and RISC-V. But we applied it consistently to both loads and stores:

    L     R1, [addr]      ; R1 ← memory  (destination = register)
    ST    [addr], R2      ; memory ← R2  (destination = memory)

This differs from ARM and RISC-V, where stores put the register first. Our reasoning: the destination is what you're changing. For a load, you're changing the register. For a store, you're changing memory. Both follow the same mental model.

Three Flavors of Load Immediate

Loading constants into registers is perhaps the most common operation in assembly programming. S/360's approach — load from a literal pool, or use LA to load 12-bit addresses — feels clunky today.

We added three new instructions:

  • LI (Load Immediate) — 20-bit zero-extended constant
  • LIS (Load Immediate Signed) — 20-bit sign-extended constant
  • LFI (Load Full Immediate) — full 32-bit constant

The 20-bit instructions fit in 4 bytes and cover most practical cases. When you need the full 32-bit range, LFI is there at 6 bytes. More importantly, LFI supports relocations — you can write LFI R1, my_data_label and the linker fills in the actual address.
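To make the ranges concrete, here is a small Python sketch of how a tool might pick the shortest of the three forms for a given constant. The function name and the idea of automatic selection are assumptions for illustration only; asm359 may simply expect you to write the mnemonic you want.

```python
def pick_load_immediate(value: int) -> tuple[str, int]:
    """Return (mnemonic, size_in_bytes) for loading `value` into a 32-bit register.

    Hypothetical helper. The ranges follow the description in this post:
    LI = 20-bit zero-extended, LIS = 20-bit sign-extended, LFI = full 32-bit.
    """
    if not -(1 << 31) <= value < (1 << 32):
        raise ValueError(f"{value} does not fit in 32 bits")
    if 0 <= value < (1 << 20):
        return ("LI", 4)    # fits in 20 bits, upper bits become zero
    if -(1 << 19) <= value < 0:
        return ("LIS", 4)   # small negative number, upper bits become one
    if value >= (1 << 32) - (1 << 19):
        return ("LIS", 4)   # same small negatives written as unsigned patterns
    return ("LFI", 6)       # everything else needs the full 32-bit immediate


for v in (0xFFFFF, -1, 0xFFFFFFFF, 0x12345678):
    print(hex(v), pick_load_immediate(v))
```

Note that a label operand would always need LFI in this model, since its final address is not known until link time.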

The Preprocessor: Standing on NASM's Shoulders

We could have implemented a minimal preprocessor. Instead, we borrowed NASM's design — one of the most capable macro systems in any assembler. The implementation spans about 4,500 lines across ten modules, handling:

  • Single and multi-line macros with parameters
  • Full conditional assembly (%if, %ifdef, %ifidn, etc.)
  • String operations (%strlen, %substr, %strcat)
  • Context stacks for local symbol scopes
  • Repeat blocks with early exit

Why this complexity? Because real-world assembly programming needs it. Generating lookup tables, unrolling loops, creating type-safe data structures — all become possible with a proper macro system.
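For a feel of what the core of such a preprocessor does, here is a toy Python sketch covering only %define, %ifdef/%else/%endif, and plain identifier substitution. It is not the asm359 implementation (which is far larger); the function name and structure are assumptions made for this example.

```python
import re


def preprocess(source: str) -> list[str]:
    """Toy preprocessor: %define, %ifdef/%else/%endif, identifier substitution."""
    macros: dict[str, str] = {}
    active = [True]   # stack: is the current conditional branch being emitted?
    out: list[str] = []
    for line in source.splitlines():
        words = line.split(None, 2)
        head = words[0] if words else ""
        if head == "%ifdef":
            # A nested block is only live if its parent is live too.
            active.append(active[-1] and words[1] in macros)
        elif head == "%else":
            taken = active.pop()
            active.append(active[-1] and not taken)
        elif head == "%endif":
            active.pop()
        elif not active[-1]:
            continue  # inside a skipped branch
        elif head == "%define":
            macros[words[1]] = words[2] if len(words) > 2 else ""
        else:
            # Replace whole identifiers only, in a single pass (no rescanning).
            out.append(re.sub(r"\b[A-Za-z_]\w*",
                              lambda m: macros.get(m.group(0), m.group(0)),
                              line))
    return out


example = """%define BASE 0x200
%ifdef DEBUG
    LI R1, 1
%else
    LI R1, 0
%endif
    LFI R2, BASE"""
print("\n".join(preprocess(example)))
```

Even this toy needs a stack for nested conditionals, which hints at why parameters, context stacks, and repeat blocks push a real implementation into the thousands of lines.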

The Toolchain Takes Shape

An assembler that only outputs raw bytes is useful for experiments but not for real programs. Modern development needs:

  • Separate compilation — assemble modules independently, link later
  • Symbol resolution — let the linker find external references
  • Relocations — record which address fields must be patched, so code can be placed at any load address

We based our object format on RDOFF2 (Relocatable Dynamic Object File Format), originally designed for NASM. It's simple enough to understand in an afternoon but complete enough for real work.

The link359 linker can produce two outputs:

  1. Relocatable modules — for further linking or dynamic loading
  2. Flat binaries — for burning into ROM or loading at a fixed address

That second option, triggered with -O bin -A 0x200, is essential for IPL (Initial Program Load) code. The GMS/359 hardware loads code at address 0x200 and starts executing — no bootloader, no filesystem, just raw machine code.
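Conceptually, producing a flat binary means resolving every relocation against the chosen load address. The Python sketch below shows that one step under a deliberately simplified model: each relocation is just an offset to a 32-bit little-endian field holding a module-relative address. This is not the RDOFF2 record layout or the link359 code; the function and its parameters are hypothetical.

```python
import struct


def link_flat(image: bytes, reloc_offsets: list[int], load_address: int) -> bytes:
    """Patch 32-bit little-endian address fields for a fixed load address.

    Hypothetical model: every offset in `reloc_offsets` points at a 4-byte
    field that holds an address relative to the start of the module.
    """
    out = bytearray(image)
    for off in reloc_offsets:
        (relative,) = struct.unpack_from("<I", out, off)
        absolute = (relative + load_address) & 0xFFFFFFFF
        struct.pack_into("<I", out, off, absolute)
    return bytes(out)


# Two prefix bytes followed by a 32-bit field that refers to module offset 0x10.
module = bytes.fromhex("c01010000000")
print(link_flat(module, [2], 0x200).hex(" "))
```

With a load address of 0x200, the field at offset 2 changes from 0x00000010 to 0x00000210, while the two leading bytes are left alone.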

Bugs Are Features in Disguise

The most interesting bugs reveal hidden assumptions. While implementing structure definitions (struc/endstruc), we discovered that our string tokenizer didn't handle EOF correctly after quoted strings. The fix was one line, but finding it required understanding the entire token flow.

Another bug appeared when assembling from a subdirectory: %include "header.g5h" would fail because we only searched the current directory and explicit include paths. The fix was to also search relative to the source file's directory — obvious in hindsight, invisible until you hit it.
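The corrected lookup can be sketched in a few lines of Python. The function name and the exact search order (current directory, then explicit include paths, then the including file's directory) are assumptions for illustration; the post only states which locations are searched, not in what order.

```python
from pathlib import Path
from typing import Optional


def resolve_include(name: str, source_file: str,
                    include_paths: list[str]) -> Optional[Path]:
    """Return the first existing candidate for an %include, or None."""
    candidates = [Path.cwd()]                                 # current directory
    candidates += [Path(p) for p in include_paths]            # explicit include paths
    candidates.append(Path(source_file).resolve().parent)     # next to the source file
    for directory in candidates:
        path = directory / name
        if path.is_file():
            return path
    return None
```

The last candidate is what makes assembling sub/main.asm from the project root work when header.g5h lives in sub/.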

What's Next

The assembler is now capable enough to write real programs: IPL code, device drivers, even a simple monitor. The next frontier is the hardware itself: getting the VHDL design to pass timing on the GateMate FPGA, continuing work on the channel I/O subsystem, and testing I/O devices such as the keyboard and PSRAM.

There's something profound about building both the hardware and the software tools. When you write LFI R1, 12345678h and watch it become c0 10 78 56 34 12 in the binary, you understand every bit. When that binary eventually runs on your custom processor, executing instructions you designed on hardware you built — that's when computing feels like craftsmanship again.

The tools are open source. The hardware designs will follow. If you're interested in retrocomputing, FPGA development, or just building things from first principles, follow along at the blog.

73 de QRV Systems

