ARM Instruction Emulation Notes

General strategy

  • Link ARM assembly functions with the ARMemu emulator. We will directly emulate machine code in memory.
  • Initialize a representation of the ARM state: registers, cpsr, and the stack.
  • For data memory (e.g., for arrays) we will use C-allocated data from the C stack or the heap.
  • We will initialize all ARM state register values to 0, including LR.
  • Point the emulator at the beginning of a function in memory.
  • Emulate the target function by repeatedly fetching an instruction word and emulating its actions on the ARM state.
    • Instruction emulation will update register values, possibly the cpsr, the stack, and memory.
    • The PC will be updated on each instruction.
    • The PC will either be updated with PC = PC + 4 or it will be set to a branch target address
  • You will need to emulate instructions from the following categories:
    • Data processing: add, sub, mov, mul, cmp
    • Memory: ldr, str, ldrb
    • Control: b, bl, bx, beq, bne (and others)
  • Before emulating an instruction you need to identify it. Different instructions have different fixed bit values in specific positions, and these fixed bits are what you test. Your goal is to identify instructions in order from most constrained to least constrained. A more constrained instruction has more fixed bit values, whereas a less constrained instruction has fewer. The BX instruction is very constrained, but the data processing instructions are not.
  • Dynamic analysis requires keeping track of the number of instructions emulated and the number of instructions executed of each type (data processing, memory, control).
  • Cache simulation requires that you model the behavior of a cache. You don't need to store actual data, but just keep track of how a cache would perform. More later.
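
The most-constrained-first ordering described above can be sketched as a small classifier. This is only an illustration: the enum, the function names, and the decision to test mul separately are assumptions made for this example, not part of ARMemu. The masks follow the standard ARM encodings (bx fixes bits 27..4, b/bl fix bits 27..25 to 101, mul fixes bits 27..22 to 000000 and bits 7..4 to 1001).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction classes, useful for dispatch and for counters. */
enum iw_kind { IW_BX, IW_BRANCH, IW_MUL, IW_MEMORY, IW_DATA, IW_UNKNOWN };

/* bx: bits 27..4 are all fixed. */
static bool is_bx(uint32_t iw)     { return (iw & 0x0FFFFFF0u) == 0x012FFF10u; }
/* b and bl: bits 27..25 are 101. */
static bool is_branch(uint32_t iw) { return ((iw >> 25) & 0x7u) == 0x5u; }
/* mul (and mla): bits 27..22 are 000000 and bits 7..4 are 1001. */
static bool is_mul(uint32_t iw)    { return (iw & 0x0FC000F0u) == 0x00000090u; }
/* ldr, str, ldrb, strb: bits 27..26 are 01. */
static bool is_memory(uint32_t iw) { return ((iw >> 26) & 0x3u) == 0x1u; }
/* data processing: only bits 27..26 are fixed (00). */
static bool is_data(uint32_t iw)   { return ((iw >> 26) & 0x3u) == 0x0u; }

/* Test the most constrained patterns first so that, for example, a bx
   is not mistaken for a data processing instruction. */
static enum iw_kind classify(uint32_t iw) {
    if (is_bx(iw))     return IW_BX;
    if (is_branch(iw)) return IW_BRANCH;
    if (is_mul(iw))    return IW_MUL;
    if (is_memory(iw)) return IW_MEMORY;
    if (is_data(iw))   return IW_DATA;
    return IW_UNKNOWN;
}
```

Note that the word for bx lr (0xE12FFF1E) also satisfies is_data, which is why the order of the tests matters.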

Emulating Data Processing Instructions

Consider the format of the data processing instructions:

This single format will be used for instructions like add, sub, mov, and cmp.

Note that the only constrained bits are in 26 and 27 (00). So, you want to try to identify data processing instructions after you have identified all the other instructions.

You can ignore the cond field for our purposes because we are only going to support conditional execution on branch instructions. You are welcome to implement conditional execution more generally, but it is not a requirement.

You can see that there are some variations in Operand 2 that include Shift and Rotate. You don't need to support these fields.

However, you need to support the following two forms:

add r0, r1, r2


add r0, r1, #3

That is, all operands can be registers or the last operand can be an immediate value.

You can determine if the third operand is a register or immediate by looking at bit 25. 0 means the third operand is a register, 1 means the third operand is an immediate value.

So, for all data processing instructions you can have common code that extracts the following fields (iw[n..m] means the bits of iw from position n down to position m, inclusive):

immediate = iw[25]
opcode = iw[24..21]
rn = iw[19..16]
rd = iw[15..12]
rm = iw[3..0]
imm = iw[7..0]

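The imm value is an 8-bit unsigned value, so zero extend it to 32 bits. Since we are not supporting the rotate field, the operand is just this value. (A negative constant is not encoded in this field; the assembler typically picks a different instruction instead, for example sub with #1 in place of add with #-1.)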
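
One possible way to write the common extraction code is with a small bit-slicing helper. The helper name bits, the struct dp_fields, and decode_dp are assumptions made for this sketch; only the field positions come from the list above.

```c
#include <stdint.h>

/* Return bits hi..lo (inclusive) of iw, shifted down to bit 0.
   Assumes 31 >= hi >= lo. */
static uint32_t bits(uint32_t iw, unsigned hi, unsigned lo) {
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (iw >> lo) & mask;
}

/* Hypothetical container for the data processing fields. */
struct dp_fields {
    uint32_t immediate, opcode, rn, rd, rm, imm;
};

static struct dp_fields decode_dp(uint32_t iw) {
    struct dp_fields f;
    f.immediate = bits(iw, 25, 25);
    f.opcode    = bits(iw, 24, 21);
    f.rn        = bits(iw, 19, 16);
    f.rd        = bits(iw, 15, 12);
    f.rm        = bits(iw, 3, 0);
    f.imm       = bits(iw, 7, 0);
    return f;
}
```

For instance, add r0, r1, #3 assembles to 0xE2810003, which decodes to immediate = 1, opcode = 0100, rn = 1, rd = 0, and imm = 3.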

Once you have these fields you should set the third operand value based on the immediate bit field (0 or 1).

operand3 = immediate ? imm : state->regs[rm];

Now, you can use an if statement or a switch statement to check the different opcodes and perform the appropriate action as necessary, for example:

if (opcode == 0b0100) {  /* add */
    state->regs[rd] = state->regs[rn] + operand3;
}

Then you need to update the PC to PC = PC + 4.

You will also update your counters for dynamic analysis and update the cache simulator.
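
Putting the steps above together, a minimal sketch of a data processing handler for add, sub, and mov might look like the following. The struct arm_state layout, the PC index of 15, and the function name emulate_dp are assumptions for illustration. The 8-bit immediate is used as an unsigned value, which matches the ARM encoding when the rotate field is zero. Counter and cache updates are left out.

```c
#include <stdint.h>

#define PC 15

/* Hypothetical ARM state: 16 registers plus the cpsr. */
struct arm_state {
    uint32_t regs[16];
    uint32_t cpsr;
};

/* Emulate one data processing instruction word.
   Returns 0 on success, -1 if the opcode is not handled in this sketch. */
static int emulate_dp(struct arm_state *state, uint32_t iw) {
    uint32_t immediate = (iw >> 25) & 0x1u;
    uint32_t opcode    = (iw >> 21) & 0xFu;
    uint32_t rn        = (iw >> 16) & 0xFu;
    uint32_t rd        = (iw >> 12) & 0xFu;
    uint32_t rm        = iw & 0xFu;
    uint32_t imm       = iw & 0xFFu;

    /* Bit 25 set means the last operand is the immediate. */
    uint32_t operand3 = immediate ? imm : state->regs[rm];

    switch (opcode) {
    case 0x2: state->regs[rd] = state->regs[rn] - operand3; break; /* sub */
    case 0x4: state->regs[rd] = state->regs[rn] + operand3; break; /* add */
    case 0xD: state->regs[rd] = operand3;                   break; /* mov */
    default:  return -1;
    }

    state->regs[PC] += 4;  /* advance to the next instruction */
    return 0;
}
```

A switch keeps each opcode on one line here; a chain of if statements as shown earlier works equally well.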

For the cmp instruction, you need to perform a subtraction of the two operands and update the CPSR register appropriately. For bne and beq, you only need to update the Z flag of the CPSR.

In the CPSR, we are concerned with the following bits:

N (Negative) cpsr[31]
Z (Zero) cpsr[30]
C (Carry) cpsr[29]
V (Overflow) cpsr[28]

N is set to one if the result is negative. Or just N = result[31] (the most significant bit).
Z is set to one if the result is zero.
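C is set to one if the unsigned result overflows (for add). For cmp and sub, ARM sets C to one when no borrow occurs (the first operand is greater than or equal to the second as unsigned values) and clears it when the subtraction borrows.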
V is set to one if the signed result overflows (cannot fit into a 32 bit 2's complement number)
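
As a worked illustration of the flag rules, here is a sketch that computes the four flag bits for cmp a, b (which evaluates a - b). The macro names and the function cmp_flags are assumptions for this example. It follows the ARM convention that C is one after a subtraction when no borrow occurs.

```c
#include <stdint.h>

#define CPSR_N (1u << 31)
#define CPSR_Z (1u << 30)
#define CPSR_C (1u << 29)
#define CPSR_V (1u << 28)

/* Return the N, Z, C, V bits produced by cmp a, b. */
static uint32_t cmp_flags(uint32_t a, uint32_t b) {
    uint32_t result = a - b;
    uint32_t flags = 0;

    if (result & 0x80000000u) flags |= CPSR_N;  /* result[31] is set */
    if (result == 0)          flags |= CPSR_Z;
    if (a >= b)               flags |= CPSR_C;  /* no unsigned borrow */
    /* Signed overflow: the operands have different signs and the
       result's sign differs from the first operand's sign. */
    if (((a ^ b) & (a ^ result)) & 0x80000000u) flags |= CPSR_V;

    return flags;
}
```

To update the state, one option is to clear the top four bits of the cpsr and OR in the returned value: cpsr = (cpsr & 0x0FFFFFFF) | cmp_flags(a, b).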

Emulating Memory Instructions

Now consider the emulation of ldr, str, ldrb, and strb. Here is the instruction format for memory instructions:

You can identify a memory instruction (single data transfer) by checking whether iw[27..26] == 01. Note that the memory instructions are also not very constrained, so you should identify them just before or after the data processing instructions.

Just like with the data processing instructions, you can ignore the cond field.

Here are the variations of the memory instructions that you should support in emulation:

ldr r0, [r1]
ldr r0, [r1, #4]
ldr r0, [r1, r0]

This should be enough to support all of your Project03 programs for testing. You can support additional variations if you want.

Also, just like with the data processing instructions you will need to extract common fields:

immediate = iw[25]  /* determines if third operand comes from a register or an immediate value */
updown = iw[23] /* 1 = the offset is added to the base register, 0 = it is subtracted; only needed for forms like ldr r0, [r1, #-4] */
byteword = iw[22] /* 0 = word (ldr/str), 1 = byte (ldrb/strb) */
loadstore = iw[20] /* 1 = load (ldr/ldrb), 0 = store (str/strb) */
rn = iw[19..16] /* base register */
rd = iw[15..12] /* this is the source for str/strb and the destination for ldr/ldrb */
imm = iw[11..0]
rm = iw[3..0]

Next you calculate the target address.

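If immediate is 0, then target = regs[rn] +/- imm (in this format a clear bit 25 selects the 12-bit immediate offset, the opposite sense from the data processing format)
If immediate is 1, then target = regs[rn] +/- regs[rm]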
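
The address calculation can be sketched as below. The function name mem_target is an assumption for this example. It relies on the ARM single data transfer encoding, where bit 25 clear selects the 12-bit immediate offset and bit 25 set selects a register offset, and it ignores the shift fields.

```c
#include <stdint.h>

/* Compute the target address of a single data transfer instruction.
   regs holds the 16 emulated register values. */
static uint32_t mem_target(const uint32_t regs[16], uint32_t iw) {
    uint32_t immediate = (iw >> 25) & 0x1u;  /* 0 = immediate, 1 = register */
    uint32_t updown    = (iw >> 23) & 0x1u;  /* 1 = add, 0 = subtract */
    uint32_t rn        = (iw >> 16) & 0xFu;
    uint32_t rm        = iw & 0xFu;
    uint32_t imm       = iw & 0xFFFu;

    uint32_t offset = immediate ? regs[rm] : imm;
    return updown ? regs[rn] + offset : regs[rn] - offset;
}
```

For example, with r1 = 0x1000, the word 0xE5910004 (ldr r0, [r1, #4]) yields 0x1004 and 0xE5110004 (ldr r0, [r1, #-4]) yields 0xFFC.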

You need to then determine if you are doing a load or store (loadstore).

Then you need to determine if you are transferring a byte or a word.

For example:

ldr r0, [r1]

unsigned int *target = (unsigned int *) (state->regs[1] + 0);  /* r1 + implicit immediate value of 0 */
rd = 0;  /* r0 */

state->regs[rd] = *target;

You will need to figure out how to do a store and also how to load and store a byte value.
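
As a starting point, here is one hedged sketch of the four transfer cases. The function name mem_transfer and its parameters are assumptions for illustration. The address is passed as a uintptr_t so the sketch also runs on a 64-bit host; on a 32-bit ARM target the 32-bit register value can serve as the address directly.

```c
#include <stdint.h>
#include <string.h>

/* Transfer between regs[rd] and host memory at addr.
   loadstore: 1 = load, 0 = store.  byteword: 1 = byte, 0 = word. */
static void mem_transfer(uint32_t regs[16], uint32_t rd, uintptr_t addr,
                         uint32_t loadstore, uint32_t byteword) {
    if (loadstore) {
        if (byteword) {
            /* ldrb: load one byte and zero extend it to 32 bits */
            regs[rd] = *(uint8_t *) addr;
        } else {
            /* ldr: load a full 32-bit word */
            uint32_t word;
            memcpy(&word, (void *) addr, sizeof word);
            regs[rd] = word;
        }
    } else {
        if (byteword) {
            /* strb: store only the low 8 bits of the register */
            *(uint8_t *) addr = (uint8_t) regs[rd];
        } else {
            /* str: store the full 32-bit register value */
            uint32_t word = regs[rd];
            memcpy((void *) addr, &word, sizeof word);
        }
    }
}
```

Using memcpy for the word cases avoids alignment and aliasing problems on the host; a plain pointer dereference as in the ldr example above is the more direct equivalent.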

You will need to update your instruction counts and your cache simulator.

Emulating Branch Instructions

You need to emulate the following branch instructions: bx, bl, b, beq, bne (and perhaps other conditionals).

Let's consider bx first because it is the easiest to emulate:

You can ignore cond.

Just extract rn:

rn = iw[3..0]

Now update the PC:

state->regs[PC] = state->regs[rn];

Branch (b) and Branch and Link (bl) require a calculation of the branch target address and support for conditional execution.

You need to extract the following fields:

cond = iw[31..28]
link = iw[24]
offset = iw[23..0]

First, you need to support conditional execution of the branch instructions. You need to check the cond field for the condition type. For example:

0000 is EQ, which means beq. So, you need to look at the Z bit in the CPSR (which was set by a previous cmp instruction). If cond is 0000 and Z is 1, then you need to take the branch (the steps below). Otherwise, if Z is 0, you need to skip the branch and simply set PC to PC + 4. Note that you still need to update your instruction counts (we count a branch not taken as executing a branch instruction) and your cache simulator.
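
The condition check can be sketched as a small predicate. The function name cond_passes is an assumption for this example, and only EQ (0000), NE (0001), and AL (1110, always) are handled; other conditions would need the N, C, and V bits as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if a branch with this cond field should be taken,
   given the current cpsr. */
static bool cond_passes(uint32_t cond, uint32_t cpsr) {
    bool z = (cpsr >> 30) & 0x1u;  /* Z flag is cpsr bit 30 */

    switch (cond) {
    case 0x0: return z;      /* EQ: taken when Z is 1 */
    case 0x1: return !z;     /* NE: taken when Z is 0 */
    case 0xE: return true;   /* AL: always taken (plain b and bl) */
    default:  return false;  /* not handled in this sketch */
    }
}
```

When the predicate returns false, the emulator sets PC to PC + 4 and still counts the instruction.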

Second, if link is 1 then you need to set LR to PC + 4 (the next instruction).

To calculate the branch target address you need to do the following:

The offset is a 24-bit 2's complement word offset, so we need to convert it to a 32-bit 2's complement byte offset. This means sign extending the offset value to 32 bits and multiplying by 4.

The offset is relative to PC + 8 rather than PC (for architectural reasons: the ARM PC reads as the address of the current instruction plus 8).

This means you must adjust the target address by adding 8.

So the target address is PC + 8 + the adjusted offset. Set the PC to this value.
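
The target calculation for a taken branch can be sketched as follows. The struct arm_state layout, the LR and PC indices, and the function name emulate_branch_taken are assumptions for illustration. The condition check is assumed to have passed already.

```c
#include <stdint.h>

#define LR 14
#define PC 15

/* Hypothetical ARM state: 16 registers plus the cpsr. */
struct arm_state {
    uint32_t regs[16];
    uint32_t cpsr;
};

/* Emulate a b or bl instruction whose condition has already passed. */
static void emulate_branch_taken(struct arm_state *state, uint32_t iw) {
    uint32_t link   = (iw >> 24) & 0x1u;
    uint32_t offset = iw & 0x00FFFFFFu;

    /* Sign extend from 24 to 32 bits: copy bit 23 into bits 31..24. */
    if (offset & 0x00800000u) {
        offset |= 0xFF000000u;
    }
    offset <<= 2;  /* word offset to byte offset (multiply by 4) */

    if (link) {
        state->regs[LR] = state->regs[PC] + 4;  /* return address */
    }

    /* Unsigned arithmetic wraps modulo 2^32, so adding a negative
       offset in 2's complement form moves the PC backwards. */
    state->regs[PC] = state->regs[PC] + 8 + offset;
}
```

As a check, a branch at address 0x1000 with offset field 0xFFFFFE (that is, -2 words) gives 0x1000 + 8 - 8 = 0x1000, a branch to itself.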