Disassembling the Triad T2556: Introduction
So the machine is (more or less) back together, we have an understanding of how all of the components connect, and it’s time to take a deeper dive into the software that the machine runs. Perhaps now is the time to learn what the beep-boop sounds mean?
In the next article, I’ll talk a little bit about what I learned by spelunking through the first kilobyte or so of program space — enough to get an understanding of the start-up and self-test code. I wanted to know at least enough that I could begin to write my own software to run on the machine. By necessity, the content is a bit more technical, but hopefully not too complex.
For now though, an explanation is needed.
In Part Four I mentioned using z80dasm as a disassembler. It does the job of taking the ones and zeros in the EPROM and converting them to a set of machine instructions that’s intelligible to human beings.
Unfortunately, there are some things that are going to slow us down. When the developers were creating this software, they had symbol tables which mapped function and variable names to addresses. These get lost along the way, since the Z80 doesn’t care about symbols: it just needs some bytes for an address and it’s golden. But the lack of a symbol table makes it harder for us humans to keep track of what’s going on. For example, there is far more meaning to `call print` than `call l0a45h`. In the first one it’s clear that the code is calling the printing subroutine; in the second one… not so much. In the case of `l0a45h`, the disassembler has simply taken the hexadecimal memory address it’s referring to, and stuck an `l` on the front. That’s how z80dasm works.
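To make the idea concrete, here is a minimal Python sketch of putting names back onto those auto-generated labels. It is not the tooling I actually built; the `SYMBOLS` table, the name `print` for address 0a45h, and the `apply_symbols` helper are all assumptions made up for illustration.

```python
import re

# Hypothetical symbol table: address -> human-friendly name.
# The single entry is illustrative, not a name recovered from the ROM.
SYMBOLS = {
    0x0A45: "print",
}

# z80dasm-style auto labels: 'l', four hex digits, 'h' (e.g. l0a45h).
LABEL_RE = re.compile(r"\bl([0-9a-f]{4})h\b")


def apply_symbols(listing: str, symbols: dict[int, str]) -> str:
    """Replace auto-generated labels with names from the symbol table."""

    def swap(match: re.Match) -> str:
        address = int(match.group(1), 16)
        # Leave the label alone if we have no better name for it.
        return symbols.get(address, match.group(0))

    return LABEL_RE.sub(swap, listing)


print(apply_symbols("call l0a45h\njp l0007h", SYMBOLS))
```

Labels with no entry in the table pass through untouched, so the table can grow one discovery at a time.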
So I realized that I couldn’t just use z80dasm; I needed to build some stuff around it. As I got further down this path, I had some other ideas to add:
- I could automatically scan the ROM for strings and add symbols for them, saving me some time.
- It would be useful to add persistent comments to the code which weren’t removed every time I ran the disassembler afresh.
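The string-scanning idea can be sketched in a few lines of Python. This is a rough illustration rather than my actual tool: the minimum length of four characters and the `s0002h`-style naming scheme are assumptions, and the approach will happily report machine code that just happens to look like text.

```python
import string

# Bytes we treat as "text": printable ASCII plus the space character.
PRINTABLE = set(string.ascii_letters + string.digits + string.punctuation + " ")


def find_strings(rom: bytes, min_length: int = 4):
    """Yield (address, text) for each run of printable ASCII in the ROM image."""
    start = None
    for address, byte in enumerate(rom):
        if chr(byte) in PRINTABLE:
            if start is None:
                start = address
        else:
            if start is not None and address - start >= min_length:
                yield start, rom[start:address].decode("ascii")
            start = None
    # A run may extend to the very end of the image.
    if start is not None and len(rom) - start >= min_length:
        yield start, rom[start:].decode("ascii")


def symbol_name(address: int) -> str:
    """Hypothetical naming scheme: 's' for string, then the address in hex."""
    return f"s{address:04x}h"


# A made-up ROM fragment: two code bytes, a string, a terminator, a ret.
rom = bytes([0x3E, 0x01]) + b"READY" + bytes([0x00, 0xC9])
for address, text in find_strings(rom):
    print(symbol_name(address), repr(text))
```

Raising `min_length` trades missed short strings for fewer false positives.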
Okay. Here’s that code that we saw a while back. Let’s start at the top.
OMG WHAT DOES THIS EVEN MEAN!?
Sorry. You know, it’s really not that bad.
The first column optionally contains a label, like `l0007h`. The disassembler helpfully places labels wherever it sees something being referenced: for example, if the code jumps to the instruction at that point, or uses some data stored there.
The second column contains the Z80 assembly instruction. This is a (semi) human readable mnemonic representing the ones and zeros that the processor executes. Alternatively, it might begin with `defw`, in which case we know that it’s just data being stored there, not an instruction.
Everything after that is separated from the first two columns by a semi-colon. The semi-colon says “the rest of this line is a comment”. So the stuff that follows isn’t needed by the machine, but it’s useful to us in understanding what’s going on.
The third column contains the address of this line’s instruction or data. This is where the particular piece of information lives on the EPROM. You’ll see that (generally speaking), numbers of all types are expressed in hexadecimal rather than decimal, which has a number of advantages after you get over how weird it looks.
The fourth column and beyond can look chaotic, but it’s basically the actual bytes stored at this address that are the numerical equivalent of the instruction, shown first as hexadecimal values and then as ASCII characters. Some Z80 instructions are encoded with one byte, others can take two, three, or even four. So the length of stuff that’s stored here will vary.
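If it helps to see the columns pulled apart mechanically, here is a small Python sketch that splits one listing line into label, instruction, address, and bytes. The exact line layout in the docstring is an assumption for illustration and may not match what z80dasm prints on your machine; `ListingLine` and `parse_line` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

HEX_DIGITS = set("0123456789abcdefABCDEF")


@dataclass
class ListingLine:
    label: Optional[str]
    instruction: str
    address: int
    raw_bytes: bytes


def parse_line(line: str) -> ListingLine:
    """Split one listing line into the columns described above.

    Assumed layout (an illustration, not a guaranteed z80dasm format):
        [label:] instruction ;address  hex bytes...  ascii
    """
    code, _, comment = line.partition(";")
    label = None
    if ":" in code:
        label, _, code = code.partition(":")
        label = label.strip()
    fields = comment.split()
    address = int(fields[0], 16)
    # Hex byte columns are exactly two hex digits; stop at the first field
    # that is not. An ASCII column that looks like hex would fool this.
    raw = bytearray()
    for field in fields[1:]:
        if len(field) == 2 and set(field) <= HEX_DIGITS:
            raw.append(int(field, 16))
        else:
            break
    return ListingLine(label, code.strip(), address, bytes(raw))


print(parse_line("l0007h: ld a,001h ;0007 3e 01 >."))
```

Everything before the semi-colon is what the assembler cares about; everything after it is the commentary we get to mine.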
If you made it this far… congratulations! Although you might not know what the actual instructions do, you can now read Z80 assembler!
By way of background, most of what we tell a microprocessor to do in these instructions will fall into a few categories. For example:
- Load a thing with a value, or content of another thing. A thing might be a register, a location in memory, or an input-output device. These are known as load (`ld`) instructions generally, and `in` and `out` instructions for… input and output.
- Perform some arithmetic operation on one or two things, or a thing and a value. This stuff is done by the arithmetic/logic unit, or ALU, inside the microprocessor. Why one or two? Operations like decrement (`dec`) will implicitly subtract one from a thing, but when we add (`add`) twenty, for example, we need to give the machine that value 20 as an operand explicitly as well as the addition operator. There are a bunch of this type of instruction.
- Make a decision on where to go based on some condition, like jumping relative to the current address if the outcome of the last instruction is a zero (`jr z, l0000h`). There’s also `jp`, which jumps to an absolute location in memory. It uses an extra byte of memory to do this but means you can jump anywhere. The general form is `jr (condition), (location)`, or `jp (condition), (location)`. There’s also a compare instruction (`cp`) which is one of the ways to set up the criteria for jumping or not.
- Jump to a new location in memory without discussion (`jr` or `jp` with no condition).
- Jump to a subroutine (`call`) to run some code, then come back when you’re done and have executed a return (`ret`).
That’s about it! There are a few weird instructions like `ei` and super weird instructions like `ldir` that we’ll see later, but this should give you a good start.
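The difference between relative and absolute jumps is easier to see with the bytes in hand. Here is a small Python sketch that works out where an unconditional jump lands. The opcodes (18h for the relative form, C3h for the absolute form) are the standard Z80 encodings; `to_signed` and `jump_target` are just names I made up for this example.

```python
def to_signed(byte: int) -> int:
    """Interpret an 8-bit value as a two's-complement signed number."""
    return byte - 256 if byte >= 128 else byte


def jump_target(code: bytes, address: int) -> int:
    """Return the destination of an unconditional jr or jp stored at `address`."""
    opcode = code[0]
    if opcode == 0x18:
        # jr: two bytes. The second is a signed offset, counted from the
        # address of the instruction that follows the jump.
        return (address + 2 + to_signed(code[1])) & 0xFFFF
    if opcode == 0xC3:
        # jp: three bytes. The address is stored low byte first.
        return code[1] | (code[2] << 8)
    raise ValueError(f"not an unconditional jump: {opcode:#04x}")


# jr with offset FEh (-2) at 0100h jumps back to itself.
print(hex(jump_target(bytes([0x18, 0xFE]), 0x0100)))
# jp with bytes 45h 0Ah goes to 0A45h, wherever it is stored.
print(hex(jump_target(bytes([0xC3, 0x45, 0x0A]), 0x0000)))
```

The relative form saves a byte but can only reach about 128 bytes in either direction, which is why both exist.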
Okay, but when you say “Load a thing with a value, or content of another thing”… what’s a thing?
To be fair, it’s a bit of a random name that I used. By thing, I mean either a register on the Z80, or a location in memory, or a combination of the two.
Fine, but what are registers?
Registers are like tiny pieces of memory on the microprocessor’s silicon that can keep track of one value. Pretty much all microprocessors have registers. The Z80 has registers called `A`, `B`, `C`, `D`, `E`, `H`, and `L`. Each one of these can hold an 8-bit value between 0 and 255. What’s a bit more special about the Z80 is that most pairs of registers can be ‘glued’ together to make a 16-bit value: `B` joins with `C`, `D` joins with `E`, and `H` joins with `L`¹. Better for math, and better for referring to any location in the whole 16-bit address space. There are some other registers that we’ll get to later, but these are the ones that are used most frequently.
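The ‘gluing’ is nothing more than treating one register as the high byte and the other as the low byte. A quick Python sketch of the arithmetic, with `join_pair` and `split_pair` as made-up helper names:

```python
def join_pair(high: int, low: int) -> int:
    """Combine two 8-bit registers (e.g. H and L) into one 16-bit value."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def split_pair(value: int) -> tuple[int, int]:
    """Split a 16-bit value back into its (high, low) 8-bit halves."""
    return (value >> 8) & 0xFF, value & 0xFF


# H = 20h and L = 00h together make HL = 2000h.
print(hex(join_pair(0x20, 0x00)))
# BC = 0A45h means B = 0Ah and C = 45h.
print([hex(part) for part in split_pair(0x0A45)])
```

In each pair the first-named register (`B`, `D`, `H`) holds the high byte.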
So. Back to things. In Z80 assembler, we represent a location in memory by wrapping it in parentheses. Let’s look at some examples.
- `ld a, 100`: load register `a` with a value of 100 (one register, so 8 bits)
- `ld bc, 0ffffh`: load register pair `bc` with the value hexadecimal FFFF (two registers, so 16 bits)
- `ld a, (02000h)`: load register `a` with the contents of memory location 2000 hexadecimal (for a single 8-bit register, this direct form only works with `a`)
- `ld hl, 02000h`: load register pair `hl` with 2000 hexadecimal
- `ld a, (hl)`: load register `a` with the contents of the memory location pointed to by the `hl` register pair
The final example is a bit more complicated, but it’s important since it allows us to change the memory location as we go, which is incredibly useful in loops and subroutines.
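To show why that pointer-style load matters, here is a Python sketch that mimics a loop built around `ld a, (hl)`. It is an illustration only: the zero byte marking the end of the text is an assumption of this example, not something I’m claiming about how the Triad stores its strings, and `read_string` is a made-up name.

```python
def read_string(memory: bytes, hl: int) -> str:
    """Walk memory from the address in hl until a zero byte is found."""
    chars = []
    while True:
        a = memory[hl]          # ld a, (hl)
        if a == 0:              # compare with zero, jump out if equal
            break
        chars.append(chr(a))
        hl = (hl + 1) & 0xFFFF  # inc hl, wrapping at the top of memory
    return "".join(chars)


# A 64K address space, all zeros, with some text placed at 2000h.
memory = bytearray(0x10000)
memory[0x2000:0x2005] = b"HELLO"

print(read_string(memory, 0x2000))
```

The same few instructions work for any string anywhere in memory, because only the value in `hl` changes.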
Okay, so given an understanding of how the disassembler lays out information, and a cursory overview of instructions and registers, what can you figure out from that disassembly dump at the beginning of the article?
I’ll talk through what this (and other) pieces of code do next time…
There are a handful of resources out there if you want to get deeper into understanding the Z80 instruction set. Just recently (and long after figuring a lot of this out from first principles), I learned about Programming the Z80 by Rodnay Zaks, which looks like a great free resource for learning the principles of this era of machinery.
If you want to know more about how the silicon works, then Ken Shirriff has done a fantastic job at reverse engineering on his blog, like this article.
¹ Why `HL` and not `FG` to follow on from `DE`? `F` is used to keep track of machine state flags, so `HL` is the next free pair of letters in the alphabet. (`H` and `L` are also commonly read as the high and low bytes of an address.)