Twenty Three Hundred
Dr Charles Martin
Semester 1, 2022
What are elements of Structured Programming?
How does that stuff translate into assembly code?
control flow is about conditional execution
x < 13
x == 4
x != -3 && y > x
length(list) < 128
These all evaluate to a boolean True or False (depending on the value of the variables)
How might you express:
> (greater than)
== (equals)
!= (not equals)
<= (less than or equal to)
<c> | meaning | flags |
---|---|---|
eq | equal | Z=1 |
ne | not equal | Z=0 |
cs | carry set | C=1 |
cc | carry clear | C=0 |
mi | minus/negative | N=1 |
pl | plus/positive | N=0 |
vs | overflow set | V=1 |
vc | overflow clear | V=0 |
hi | unsigned higher | C=1 ∧ Z=0 |
ls | unsigned lower or same | C=0 ∨ Z=1 |
ge | signed greater or equal | N=V |
lt | signed less | N≠V |
gt | signed greater | Z=0 ∧ N=V |
le | signed less or equal | Z=1 ∨ N≠V |
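To make the table concrete, here is a minimal Python sketch that treats each condition code as a test on the N, Z, C and V flags. Python is used only for illustration, and the names CONDITIONS and condition_holds are hypothetical (they are not part of any assembler or toolchain).

```python
# Each condition suffix is a predicate over the four flags (each flag is 0 or 1).
# This mirrors the table row by row; the names here are made up for illustration.
CONDITIONS = {
    "eq": lambda n, z, c, v: z == 1,
    "ne": lambda n, z, c, v: z == 0,
    "cs": lambda n, z, c, v: c == 1,
    "cc": lambda n, z, c, v: c == 0,
    "mi": lambda n, z, c, v: n == 1,
    "pl": lambda n, z, c, v: n == 0,
    "vs": lambda n, z, c, v: v == 1,
    "vc": lambda n, z, c, v: v == 0,
    "hi": lambda n, z, c, v: c == 1 and z == 0,
    "ls": lambda n, z, c, v: c == 0 or z == 1,
    "ge": lambda n, z, c, v: n == v,
    "lt": lambda n, z, c, v: n != v,
    "gt": lambda n, z, c, v: z == 0 and n == v,
    "le": lambda n, z, c, v: z == 1 or n != v,
}


def condition_holds(cond, n, z, c, v):
    """Return True if a b<cond> branch would be taken with these flag values."""
    return CONDITIONS[cond](n, z, c, v)
```

For example, condition_holds("gt", n=0, z=0, c=1, v=0) is True, while condition_holds("hi", n=0, z=1, c=1, v=0) is False because Z is set.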
if (x == -24)
@ assume x is in r0
adds r1, r0, 24
beq then
In words: add 24 to x and, if the result is zero, branch to the then label
if (x > 10)
@ assume x is in r0
subs r1, r0, 10
bgt then
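In words: subtract 10 from x to set the flags, and branch to the then label if they show that x was (signed) greater than 10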
@ assume x is in r0
cmp r0, 10
bgt then

mov r1, 10
cmp r1, r0
bmi then

mov r1, 11
cmp r0, r1 @ note the opposite order of r0, r1
bge then
are there others?
which is the best?
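One way to explore these questions is to model the flags yourself. The following is a rough Python sketch, assuming 32-bit registers and the usual subtract semantics for cmp; the function names (cmp_flags, variant_bgt, variant_bmi, variant_bge) are hypothetical and exist only for this illustration.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def cmp_flags(a, b):
    """Flags after `cmp a, b` (which computes a - b), returned as (N, Z, C, V)."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = result >> 31                 # sign bit of the result
    z = int(result == 0)
    c = int(a >= b)                  # for a subtraction, carry set means "no borrow"
    # signed overflow: the operands had different signs and the result's sign differs from a's
    v = ((a ^ b) & (a ^ result)) >> 31
    return n, z, c, v


def variant_bgt(x):
    """cmp r0, 10 ; bgt then"""
    n, z, c, v = cmp_flags(x, 10)
    return z == 0 and n == v


def variant_bmi(x):
    """mov r1, 10 ; cmp r1, r0 ; bmi then"""
    n, z, c, v = cmp_flags(10, x)
    return n == 1


def variant_bge(x):
    """mov r1, 11 ; cmp r0, r1 ; bge then"""
    n, z, c, v = cmp_flags(x, 11)
    return n == v
```

In this model the bgt and bge variants agree with x > 10 for every signed 32-bit x, because they look at both N and V. The bmi variant only looks at N, so it gives the wrong answer when 10 - x overflows: variant_bmi(-2**31) is True even though -2**31 is not greater than 10.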
You need to get to know the different condition codes.
It’s hard at first, but you get the hang of it. Practice, practice, practice!
if (register1 == register2) {
    register3 = 1;
} else {
    register3 = 0;
}

register3 := if register1 == register2 then 1 else 0;

if register1 == register2:
    register3 = 1
else:
    register3 = 0

register3 = register1 == register2 ? 1 : 0
Same structure, different syntax.
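All of these have:
a condition to test (the if part)
code to run when the condition is true (the then part)
code to run when the condition is false (the else part)
How do these look in assembly?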
if:
@ set flags here
b<c> then
then:
@ instruction(s) here
else:
@ instruction(s) here
rest_of_program:
@ continue on...
What are the problems with this? (there are a few!)
if:
@ set flags here
b<c> then
b else @ this wasn't here before
then:
@ instruction(s) here
b rest_of_program
else:
@ instruction(s) here
rest_of_program:
@ continue on...
if:
@ set flags here
b<c> then
@ else label isn't necessary
else:
@ instruction(s) here
b rest_of_program
then:
@ instruction(s) here
rest_of_program:
@ continue on...
if:
@ x is in r0
cmp r0, 0
blt then
else:
@ don't need to do anything!
b rest_of_program
then:
mov r1, -1
mul r0, r0, r1
rest_of_program:
@ "result" is in r0
@ continue on...
Labels must be unique, so you can’t have more than one then label in your file
So if you want more than one if statement in your program, you need numbered labels such as if_1, then_1, else_1
while register1 < 100 loop
    register1 := register1 ** 2;
end loop;

while (register1 < 100) {
    register1 = register1 * register1;
}

while register1 < 100:
    register1 = register1 ** 2
Remember that the while loop checks the condition and then runs the body (not run then check).
begin_while:
@ set flags here
b<c> while_loop
b rest_of_program
while_loop:
@ loop body
b begin_while
rest_of_program:
@ continue on...
while (x != 5)
while (x != 5) {
    x = x / 2;
}
begin_while:
cmp r0, 5
bne while_loop
b rest_of_program
while_loop:
asr r0, r0, 1
b begin_while
rest_of_program:
@ continue on...
begin_while:
cmp r0, 5
@ "invert" the conditional check
beq rest_of_program
asr r0, r0, 1
b begin_while
rest_of_program:
@ continue on...
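To see what this loop does for different starting values, here is a small Python trace of the assembly above. It is only a sketch: the helper names (asr1, c_div2, run_while) and the iteration cap are assumptions added for illustration, and the values are assumed to fit in a signed 32-bit register.

```python
def asr1(x):
    """Model `asr r0, r0, 1` on a signed value (Python's >> is an arithmetic shift)."""
    return x >> 1


def c_div2(x):
    """Model C's `x / 2` on ints, which truncates toward zero."""
    return -((-x) // 2) if x < 0 else x // 2


def run_while(x, max_iterations=64):
    """Trace: cmp r0, 5 / beq rest_of_program / asr r0, r0, 1 / b begin_while.

    The cap is not in the assembly; it just stops the trace when the
    loop would otherwise run forever.
    """
    trace = [x]
    while x != 5 and len(trace) <= max_iterations:
        x = asr1(x)
        trace.append(x)
    return trace
```

Starting from 20 the trace is [20, 10, 5] and the loop exits. Starting from 8 the value reaches 0 and stays there, so the real loop would never exit. Note also that asr and C division only agree for non-negative values: asr1(-7) is -4, while c_div2(-7) is -3.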
The loop condition was a not-equal (!=) test, but the assembly used a branch if equal (beq) instruction
We used the cmp instruction to set flags without changing the values in registers
for register1 in 1..100 loop
    register3 := register3 + register1;
end loop;

for (register1 = 1; register1 <= 100; register1++) {
    register3 += register1;
}

for register1 in range(1, 101):
    register3 += register1

for register1 in 1..100 do
    register3 += register1;
What are the components?
How do these look in assembly?
begin_for:
@ init "index" register (e.g. i)
loop:
@ set flags here
b<c> rest_of_program
@ loop body
@ update "index" register (e.g. i++)
b loop
rest_of_program:
@ continue on...
// sum all the odd numbers < 10
int oddsum = 0;
for (int i = 0; i < 10; ++i) {
    if (i % 2 == 1) {
        oddsum = oddsum + i;
    }
}
mov r0, 0 @ oddsum
mov r1, 0 @ i (index)
for:
cmp r1, #10 @ expression
bge exit_for @ boolean test: if i >= 10, exit loop
@ loop body, need to test if i is odd
tst r1, #1 @ tests if bit 0 is set i.e., i is odd
beq not_odd @ test if NOT odd, then exit if
@ then: is odd
add r0, r0, r1
not_odd: @ else: not odd
add r1, #1 @ increment index: i = i + 1
b for @ go back to top of for loop
exit_for:
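A line-by-line Python rendering of the assembly above can help when checking the branch logic by hand. This is only a sketch: the function name odd_sum is hypothetical, and the variables r0 and r1 simply stand in for the registers.

```python
def odd_sum():
    """Mirror the assembly loop, one Python statement per instruction or branch."""
    r0 = 0                      # mov r0, 0        @ oddsum
    r1 = 0                      # mov r1, 0        @ i (index)
    while True:                 # for:
        if r1 >= 10:            # cmp r1, #10 / bge exit_for
            break
        if (r1 & 1) != 0:       # tst r1, #1 / beq not_odd (branch skipped when bit 0 is set)
            r0 = r0 + r1        # add r0, r0, r1
        r1 = r1 + 1             # not_odd: add r1, #1
    return r0                   # exit_for:
```

Calling odd_sum() returns 25, which is 1 + 3 + 5 + 7 + 9.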
There are other variations too, e.g. do while instead of just while, and statements that change the flow inside a loop (break, continue)
But in assembly language they all share the basic features we’ve looked at here
You need to be confident at writing control structures in assembly! This is core knowledge.
Goal: write a program to SHOUT any string
Have you noticed that there are <c> bits on lots of instructions on the cheat sheet?
What happens if you try addeq r1, r1, #1?
Error: thumb conditional instruction should be in IT block -- `addeq r1,r1,#1'
Remember that the Thumb-2 ISA is a compromise between the 16-bit Thumb and 32-bit ARM ISAs. Some things (e.g., conditions on every instruction) just don’t fit in 16 bits!
IT blocks cleverly use 8 bits in the xPSR to store a plan for an if-then-else statement that can have up to four instructions.
You have to say what the condition is (here EQ), and which instructions are going to be “thens” or “elses”.
The first instruction following the IT instruction is always a “then”.
cmp r0, 42
IT EQ
addeq r1, r1, #1
You can add up to three Ts (thens) or Es (elses) after the IT, e.g., here’s an if-then-else.
cmp r0, 42
ITE EQ
addeq r1, r1, #1
subne r1, r1, #1
Saves some space if you’re only doing a few instructions!
Have a look at A7.3 in the ARMv7-M Architecture Reference Manual for more information
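If you are curious how the "plan" fits into 8 bits, here is a rough Python sketch of the IT instruction's 16-bit encoding as described in the ARMv7-M manual: a fixed top byte, then a 4-bit condition and a 4-bit mask. The helper name encode_it and the COND_CODES table are hypothetical names for this illustration; compare against your own disassembly before relying on it.

```python
# 4-bit condition numbers, in the same order as the condition code table
COND_CODES = {
    "eq": 0b0000, "ne": 0b0001, "cs": 0b0010, "cc": 0b0011,
    "mi": 0b0100, "pl": 0b0101, "vs": 0b0110, "vc": 0b0111,
    "hi": 0b1000, "ls": 0b1001, "ge": 0b1010, "lt": 0b1011,
    "gt": 0b1100, "le": 0b1101,
}


def encode_it(mnemonic, cond):
    """Return the 16-bit encoding for e.g. encode_it("ite", "eq")."""
    mnemonic = mnemonic.lower()
    pattern = mnemonic[2:]  # the optional T/E letters after "it"
    if mnemonic[:2] != "it" or len(pattern) > 3 or any(ch not in "te" for ch in pattern):
        raise ValueError(f"not an IT mnemonic: {mnemonic!r}")
    firstcond = COND_CODES[cond.lower()]
    low_bit = firstcond & 1
    mask = 0
    for letter in pattern:
        # a "then" repeats the low bit of the condition, an "else" flips it
        bit = low_bit if letter == "t" else low_bit ^ 1
        mask = (mask << 1) | bit
    mask = (mask << 1) | 1          # a single 1 marks the end of the block
    mask <<= 3 - len(pattern)       # pad with zeros to four bits
    return 0xBF00 | (firstcond << 4) | mask
```

With this sketch, hex(encode_it("it", "eq")) gives 0xbf08 and hex(encode_it("ite", "eq")) gives 0xbf0c. The low 8 bits (condition plus mask) are the plan that the processor keeps track of while the block executes.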
“But where in memory does it go?”
As we saw last week the lowest (in terms of memory addresses) part of the address space is for instructions/code
The SRAM is the next lowest—how do we put stuff in there?
As well as instructions (e.g. mov, mul), there are certain assembler directives where the assembler doesn’t do any “encoding”: it just plonks the value into the instruction stream as-is
Each of these directives allows you to insert multiple values, one-after-the-other:
.byte 1, 5, 0xf2, 0b110100 @ 4 bytes total
.hword 0, 0, 0x1234 @ 3x2=6 bytes
.word 0xdeadbeef, 0x5 @ 2x4=8 bytes
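To see which bytes end up in memory, here is a small Python sketch using the standard struct module. It assumes a little-endian target (which the microbit is), and the helper names byte_directive, hword_directive and word_directive are made up for this illustration.

```python
import struct


def byte_directive(*values):
    """Bytes produced by `.byte v1, v2, ...` (1 byte each)."""
    return b"".join(struct.pack("<B", v) for v in values)


def hword_directive(*values):
    """Bytes produced by `.hword v1, v2, ...` (2 bytes each, little-endian)."""
    return b"".join(struct.pack("<H", v) for v in values)


def word_directive(*values):
    """Bytes produced by `.word v1, v2, ...` (4 bytes each, little-endian)."""
    return b"".join(struct.pack("<I", v) for v in values)
```

For example, word_directive(0xdeadbeef, 0x5).hex() is "efbeadde05000000": the least significant byte of each word comes first in memory.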
Recall that ldr/str require a memory address to load/store to
ldr r0, [r1] @ r1 holds the memory address
There are also “offset” versions of these instructions:
@ address in r1, load value at address+4
ldr r0, [r1, 4]
@ address in r1, store value to address-4
str r0, [r1, -4]
it’s all on the cheat sheet
When might these “load/store with offset” versions of the ldr/str instructions be useful? Think of as many scenarios as you can!
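One scenario is stepping through the elements of an array (or the fields of a record) from a single base address. The Python sketch below models a tiny memory as a bytearray; the helper names ldr, store and demo, the 16-byte memory and the base address 4 are all assumptions made up for illustration.

```python
import struct


def ldr(memory, base, offset=0):
    """Model `ldr rd, [rn, offset]`: read the little-endian 32-bit word at base+offset."""
    return struct.unpack_from("<I", memory, base + offset)[0]


def store(memory, value, base, offset=0):
    """Model `str rd, [rn, offset]`: write a 32-bit word at base+offset."""
    struct.pack_into("<I", memory, base + offset, value & 0xFFFFFFFF)


def demo():
    memory = bytearray(16)      # a tiny pretend memory, all zeros
    array_base = 4              # hypothetical address of a 3-word array
    for index, value in enumerate([10, 20, 30]):
        store(memory, value, array_base, 4 * index)   # like: str r0, [r1, 4*index]
    second = ldr(memory, array_base, 4)               # like: ldr r0, [r1, 4]
    store(memory, 99, array_base, -4)                 # like: str r0, [r1, -4]
    return second, ldr(memory, 0)
```

Here demo() returns (20, 99): the offset of 4 reaches the second array element, and the offset of -4 writes to the word just before the array, all without changing the base address.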
What will this program do? Hint: which address does the pc register “point to”?
main:
ldr r0, [pc, 4]
b main
.align 2
beefword:
.word 0xdeadbeef
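As a hint for working this out: in Thumb code, reading pc gives the address of the current instruction plus 4, and a pc-relative ldr rounds that value down to a multiple of 4 before adding the offset. The Python sketch below just does that arithmetic; the function name thumb_pc_relative_address and the example addresses are hypothetical.

```python
def thumb_pc_relative_address(instruction_address, offset):
    """Address accessed by `ldr rd, [pc, offset]` located at instruction_address."""
    pc = instruction_address + 4   # in Thumb state, pc reads as this instruction + 4
    base = pc & ~0b11              # the pc-relative load word-aligns that value
    return base + offset
```

For an ldr placed at address 0x100, thumb_pc_relative_address(0x100, 4) is 0x108. Whether that is the address of beefword depends on how many bytes the assembler uses for the ldr and b instructions (2 or 4 each), so check the disassembly rather than trusting this sketch.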
ldr= pseudo-instruction
Storing little bits of data in the instruction stream is such a useful trick that the assembler provides a special syntax for it (note the = sign before the value):
ldr r2, =0xdeadbeef
It’s called a pseudo-instruction because the assembler might actually produce a different instruction (e.g. a mov instead of an ldr)
0xDEADBEEF?
There are a bunch of numeric literal values which are often used in systems programming, e.g. 0xDEADBEEF, 0x8BADF00D (used on iOS)
Wikipedia has a list of them if you’re interested
But there’s nothing special about them (from the microbit’s perspective)
This is used all the time to load the value of a label (which is just a memory address) into a register (so you can load or store to that address)
This instruction loads its own address into r0 (how meta!)
loop:
ldr r0, =loop
We need to be careful about these words (code and data), because there’s no difference between them from the microbit’s point of view
For every instruction or directive you write (e.g. .hword), ask yourself:
what will it get encoded to (0s and 1s)?
where in memory (i.e. at which addresses) will those 0s and 1s live when the program is running?
.data section
All of this stuff still only affects what goes in the code section. How do we put stuff in SRAM?
We use the .data assembler directive (and a label for keeping track of the memory address)
ldr r0, =stuff @ load address of stuff into r0
ldr r1, [r0]
@ more code here...
.data @ from here on, everything goes in the data section
stuff:
.word 0xdeadbeef
What will be in r0 after the second line of the program has been executed?
ldr r0, =stuff @ load address of stuff into r0
ldr r1, [r0]
@ more code here...
.data
stuff:
.word 0xdeadbeef
You can put data in SRAM (from 0x20000000) using a .data section
The extra stuff in the startup file (e.g. LoopCopyDataInit) is important here (try deleting it and re-running the program)
This is necessary because the microbit doesn’t let you write to any addresses in the code section
You can organise the sections in your source .S file however you like, e.g.,
.text
@ anything here is code
@ ...
.data
@ anything here will go in SRAM
@ ...
.text
@ back to code
@ ...
.text means “code” (it’s also the default section)
the linker file makes sure everything gets put into the right place in the memory space
Chapter 2: “Instructions: Language of the Computer”