ARM
Data processing
Data Moving
Similar to amd64, the mov instruction can be used. However, literal values must be prefixed with the # symbol!
mov x1, #0x1337aarch64 registers are 64 bits in size, but the mov instruction only works with 16 bit immediate values.
In order to move larger literal values, the mov and movk instructions are needed.
movk loads a value into the destination register with a specific bitshift, retaining all other bytes.
mov x1, #0xbeefmovk x1, #0xdead, lsl #16Results in x1 containing the value 0xdeadbeef.
Load / Store
Memory addresses cannot be directly accessed in aarch64. Only registers can be operated on.
Values must be loaded from memory to a register with ldr and written back to memory via str.
For example, to increment a value located at memory address 0x1337, the following instructions would be needed:
mov x1, #0x1337ldr x0, [x1]add x0, x0, #1str x0, [x1]Locations memory addresses can also be offset from. Example:
mov x1, #0x4000ldr x0, [x1, #8]Would load 8 bytes stored at 0x4008 into x0.
Consecutive memory addresses can be loaded and stored in a single instruction as a pair:
ldp x0, x1, [sp]stp x0, x1, [sp]Above is equivalent to the following instructions:
ldr x0, [sp]ldr x1, [sp, #8]str x0, [sp]str x1, [sp, #8]Stack
aarch64 does not have the push/pop instructions to work with the stack, instead, you must use ldr and str to retrieve values from the stack.
Fortunately, both ldr and str have the ability to increment the address passed in pre/post access.
This feature can be used to perform the same action:
Popping the stack would be of the form:
ldr x1, [sp], #16This loads the value located at the stack pointer into register x1 and then adds 16 to the stack.
Pushing to the stack would be of the form:
str x1, [sp, #-16]!This subtracts 16 from the stack pointer and then stores the value in x1 at sp.
NOTEIn aarch64, the stack pointer must be 16 byte aligned! Accessing the stack pointer when it is not properly aligned will result in a fault!
There is different syntax for accessing memory at an offset, pre-indexing, and post-indexing. All of these forms are used extensively in aarch64.
Arithmetic Instructions
Arithmetic instructions take three arguments:
add x0, x1, x2This is equivalent to x0 = x1 + x2.
madd x0, x0, x1, x2This is equivalent to x0 = x2 + (x0 * x1).
Modulo in aarch64 cannot be done in a single instruction.
r = a % b is equivalent to:
q = a / br = a - q * bFor example, calculate x0 = x0 % x1:
sdiv x2, x0, x1msub x0, x2, x1, x0Branch Instructions
Loops can be created using conditional branch instructions.
The branch instruction in aarch64 is b.
To conditionally branch a dot suffix (ex: .gt) is appended resulting in b.gt. This would be equivalent to jg in amd64.
Functions
Function calls in aarch64 are done with the branch and link instruction bl.
The functions return value is stored in register x0.
The bl instruction:
- does a PC relative jump the specified location
- and stores the return address in the link register
lr(akax30)
It is the caller’s responsibility to store the existing lr value frame pointer and any needed values in x0 - x15.
Registers x16 - x18 will be discussed later.
Registers x19 - x28 are callee saved.
The saved return address is stored in a special link register lr (aka x30).
The saved frame pointer is stored in a special frame register fr (aka x29).
Given the role of x29 and x30, it is common to see a function prologue similar to:
stp x29, x30, [sp, #-48]!mov x29, spHere, the stack pointer is decremented to create a function frame and lr and fr are stored on the stack. The last instruction shown sets the frame pointer.
Note that the stack pointer and frame pointer are equal in this case. Local variables are stored ABOVE the frame pointer. The stack pointer may decrement further when passing arguments via the stack or for dynamic stack allocations (alloca).
Similarly, a function epilogue consists of:
ldp x29, x30, [sp], #48retWhich restores lr, fr and the stack before returning.
Example 1
Function form calc_avg(ptr, count)
where:
ptris the start of the arraycountis the number of 64 bit numbers in the array
mov x0, ptrmov x1, 64bl calc_avg
calc_avg: stp x29, x30, [sp, #-0x10]! mov x29, sp mov x2, xzr // sum = 0 mov x3, x1 // original_count = countloop: cbz x1, done ldr x4, [x0], #0x8 add x2, x2, x4 subs x1, x1, #0x1 b.ne loopdone: sdiv x0, x2, x3 mov sp, x29 ldp x29, x30, [sp], #0x10 retExample 2
Function form fib(pos)
where:
posis position in the fibonacci sequence
// fib(0) = 0// fib(1) = 1// fib(n) = fib(n-1) + fib(n-2)
fib: stp x29, x30, [sp, #-0x10]! mov x29, sp cbz x0, .ret0 // if pos == 0, return 0 cmp x0, #0x1 b.eq .ret1 // if pos == 1, return 1
mov x1, #0x0 // prev = 0x0 mov x2, #0x1 // curr = 0x1 mov x3, x0 // counter = pos.loop: add x4, x1, x2 // next = prev + curr mov x1, x2 // prev = curr mov x2, x4 // curr = next subs x3, x3, #0x1 cmp x3, #0x1 b.gt .loop
mov x0, x2 // return curr mov sp, x29 ldp x29, x30, [sp], #0x10 ret.ret0: mov x0, xzr mov sp, x29 ldp x29, x30, [sp], #0x10 ret.ret1: mov x0, #0x1 mov sp, x29 ldp x29, x30, [sp], #0x10 retReferences
MIPS
TODO