Week 12: DIY operating system
It’s the last week, but this is a very important lab with plenty of extension stuff at the end. This week’s work pulls together key concepts from our work on interrupts, operating systems, and CPU architecture. Don’t skip it!
Outline
Before you attend this week’s lab, make sure:
-
you understand how stacks work
-
you can write & enable an interrupt handler function
In this week’s lab you will:
-
explore (and exploit) the way the NVIC saves & restores register values when an interrupt handler is executed
-
construct the stack for a new process, then (manually) switch the stack pointer and watch the microbit execute that process
-
use multiple stacks to create your own multi-tasking operating system!
Introduction
Today you’ll write your own operating system—you can call it yournameOS (feel free to insert your own name in there). At the beginning of this course the possibility of writing your own OS may have seemed pretty far away, but you’ve now got all the tools to write a (basic) multitasking OS. This lab brings together all the things you’ve learned in this course, especially if you have a crack at some of the extension challenges.
Discuss with a colleague in your lab: how is it that your computer can do
heaps of things at once (check emails, have multiple programs and browser
tabs open, check for OS updates, idle on Steam, etc.)? Is there just a giant
main
loop which does all those things one-at-a-time? Or is there some other
way to achieve this?
The basic idea of today’s lab is this: instead of just using the default stack
(i.e. leaving the stack pointer sp
pointing where it did at startup) you’ll
set up and use multiple different stacks. As you’ll see, a stack is all you
need to preserve the context for a process—an independent
sequence of execution—and switching between processes is as simple as
changing the stack pointer sp
to point to a different process’s stack. The
interrupt hardware (i.e. the NVIC) even does a bunch of this work for you.
Plug in your microbit, fork & clone the lab template and let’s get started.
Task 1: anatomy of an interrupt handler stack frame
In the first task it’s time to have a close look at how the current execution context is preserved on the stack when an interrupt is triggered.
Using a simple delay
loop and the the usual helper functions in led.S
,
modify your program so that after main
it enters an infinite loop
which blinks one LED on and off at a frequency of about 1Hz. The exact
numbers aren’t important in this task, so pick some timing values which seem
about right to you.
When the ledblink
loop is running, pause the execution using the debugger and
have a look at the various register values—lr
, pc
, sp
r0
-r3
—you
should be starting to get a feel for the numbers you’ll see in each one. These
values make up the execution context—the “world” that the CPU sees when your
program (i.e., your ledblink
loop) is running.
Then, enable and configure the SysTick timer to trigger an interrupt every
millisecond. There’s a big comment (starting at main.S
line 12 in the
template repo) giving you some hints—you just need
to write the bits to the correct memory addresses. When figuring out the value
for the reload value register (SYST_RVR
) remember that your board runs at 64MHz
on startup.
Once that’s working, you should be able to set and trigger a breakpoint in the
“do-nothing” SysTick_Handler
at the bottom of main.s
1. When
this breakpoint is triggered, use the memory view to poke around on the
stack—remember that sp
points to the “top” of the stack, and the rest of the
stack is at higher memory addresses than sp
(which will appear below the
sp
memory cell on the screen in the Memory Browser because the addresses are
ordered from lower addresses at the top to higher addresses at the bottom). Can
you see any values which look similar to the values you saw when you were
looking around the execution context earlier?
Here’s what’s happening: when the SysTick interrupt is triggered, as well as
switching the currently-executing instruction to the SysTick_Handler
function, the NVIC also saves the context state onto the stack2,
so that the stack before & after the interrupt looks something like this
(obviously the actual values in memory will be different, but it’s the position
of each value on the stack that’s the important part):
Don’t be fooled by the register names (e.g. lr
or xpsr
) alongside the values
in the stack. While the interrupt handler (in this case SysTick_Handler
, but
it’s the same for all interrupts) is running, that context isn’t in the
registers, it’s “frozen” on the stack. When the handler returns (with bx lr
,
as usual) this context is popped off the stack and back into the registers and
the CPU picks up where it left off before.
Discuss with your imaginary/real neighbour—how does the program know to do all this context save/restore stuff when it returns from the interrupt handler? Why doesn’t it just jump back to where it came from like a normal function?
You might have noticed a slightly weird value in the link register lr
:
0xFFFFFFF9
. You might have thought “that doesn’t look like any return value
I’ve seen before—they usually look like 0x8000cce
or 0x80002a0
”. Well, the
trick is that the value 0xFFFFFFF9
3 isn’t an regular
location/label in the code part of your program, it’s a special exception
return value. When the CPU sees this value in the target register in a bx
instruction then it does the whole “pop the values off the stack (including the
new pc
) and execute from there” thing.
Commit & push your “empty SysTick handler” program to GitLab. That’s all you need to do for Task 1, it’s just laying the groundwork for what’s to come.
Task 2: a handcrafted context switch
Using a carefully-prepared stack, is it possible to call your ledblink
function without calling it directly using a bl
instruction?
The answer is yes, and that’s what you’re going to do in Task 2. Disable (or just don’t enable) your SysTick interrupt—you won’t be needing it in this task.
Again, the key takeaway from Task 1 is that the context (the “world” of the current process’s execution) can be “frozen” on the stack, and then at any time you can “unfreeze” the process and send it on its way by popping those values off the stack and back into the registers.
In the last task, the frozen context was placed on the stack automatically by the NVIC before the interrupt handler function was called, but in this task you’re going to hand-craft your own context stack by writing the appropriate values into memory near the stack pointer.
To do this, you’ll need a chunk of microbit memory which isn’t being used for
anything else. There are several ways you could do this, but this time let’s
just pick a high-ish address (say, 0x20008000
) in the RAM section of the
microbit’s address space.
You can get away with this since your program is the only thing running on the microbit, so if the other parts of your program leave that memory alone then you’ll be ok. On a multi-tasking OS, though, you have to share the memory space with other programs (some of which you didn’t write and you don’t know how they work) and so this assumption may not hold. There are a few ways to deal with this problem—can you think of how you might do it?
Once you’ve picked an address for your new stack pointer, you need to create the
stack frame. This can be anywhere in memory—there’s nothing special about
“stack memory”, it’s just a bunch of addresses that you read from & write to
with ldr
and str
(and friends). The memory address described above
(0x20008000
) could be any old place where there’s a bit of RAM which you’re
not using for some other purpose.
To create stack frame, write a create_process
function which:
-
loads the new stack pointer address (above) into
sp
-
decrements the stack pointer by 32 bytes (8 registers, 4 bytes per register) to make “room” for the things you need to put on the stack
-
writes the correct values on the stack (see the picture above) to represent a running
ledblink
-loop-
the status register (you can use the default value of
0x01000000
) goes at an offset of28
from your new stack pointer -
the program counter
pc
should point to the next instruction (which might be a label) to execute when the process is restored -
the link register
lr
should point to the instruction for the process to return to when it’s “done” (this doesn’t matter so much for the moment, because yourledblink
loop is infinite—it neverbx lr
s anywhere) -
put whatever values you need into the slots for
r12
and thenr3
-r0
—these are just the register values (arguments, basically) for yourledblink
process (think: do you need anything particular in here, or does it not matter for how yourledblink
loop runs?)
-
Once you’ve created the stack for your new process, write a switch_context
function to actually make the switch. This function takes one argument (the
new stack pointer) and does the opposite of step 3 above, loading the “context”
variables from the stack and putting them back into registers:
-
restore (i.e. put back) the flags into the
xpsr
register (since this is a special register you can’t justldr
into it, you have to load into a normal register liker0
first and then use the “move to special register” instruction4msr apsr_nzcvq, r0
) -
restore the rest of the registers except for
pc
-
make sure the stack pointer
sp
points to the “new” top of the stack (i.e. after theledblink
context has been popped off) -
finally, set the
ledblink
process running by restoring thepc
. Make sure that you have declaredledblink
as a function, e.g..type ledblink, %function ledblink: ...
Why can’t you restore pc
with the rest of the registers in step 2?
Write a program which creates a ledblink
stack frame “by hand” in
create_process
and then switches to this new ledblink
context using
switch_context
. When it runs, your program should blink the LED. Commit &
push your program to GitLab.
You may have noticed that the interrupt handling procedure only preserves
r0
-r3
, but not r4
-r11
. This won’t bite you if your processes don’t use
r4
-r11
, but how could you modify your switch_context
function to also
preserve the state of those registers?
Task 3: writing a scheduler
What’s the minimum amount of data (of any type) that you need to store to keep track of a process?
To turn what you’ve written so far into a fully-fledged
multitasking OS, all you need is a scheduler function which runs regularly (in
the SysTick_Handler
) and makes the context switch as appropriate.
In this task you’ll put these pieces together to create version 1 of yournameOS. yournameOS is pretty basic as far as OSes go, it only supports two concurrent processes (for v1, at least). One of them blinks one LED, and the other one blinks a different LED (but with a different blink period—time between blinks).
The bookkeeping required for keeping track of these two pointers is just three words: two stack pointers, and a value for keeping track of which process is currently executing. You can the whole process table in the data section like this (note from the difference between the stack pointer values that the OS has a maximum stack size of about 4kB):
.data
process_table:
.word 0 @ index of currently-operating process
.word 0x20008000 @ stack pointer 1
.word 0x20007000 @ stack pointer 2
The only other tricky part is to combine the “automatic” context save/restore
functionality of the interrupt handler (as you saw in Task 1) with the
“manual” context save/restore behaviour of your switch_context
function from
Task 2. You probably don’t even need a separate switch_context
function
this time, you can just do it in the SysTick_Handler
.
You can structure your program however you like, but here are a few bits of functionality you’ll need:
-
a
create_process
function which initialises the stack (like you did in the previous task) for each process you want to run -
a
SysTick_Handler
(make sure you re-enable the SysTick interrupt) which will
-
read the first entry in the process table to find out which process is currently executing
-
pick the other process and swap that stack pointer into the
sp
register (but don’t change thepc
yet!) -
update the
process_table
so that it shows the new process as executing -
trigger an interrupt return to get things moving again (make sure the handler function still exits with a
bx
to the special value0xFFFFFFF9
)
If you get stuck, remember to step through the program carefully to find out exactly what’s going wrong.
Write yournameOS version 1, including both a ledblink
and an otherblink
processes which execute concurrently. Copy the code to tasks/task-3.S
and
push it to gitlab.
The “return from interrupt” value for lr
is usually 0xFFFFFFF9
, but other
values are possible! If you look in section B1.5.8 (p.595) of the Architecture
Reference Manual, “Exception Return
Behaviour”, you’ll see that bits 0-3 in this exception return value can help
you switch between two SP
values (the Cortex M4 can actually keep track of
two stacks for you: “main” and “process”), and put the CPU into “Handler” or
“Thread” mode. How would these features help you to upgrade yournameOS?
Extra Tasks: pimp your OS
Once you’ve got your multi-process yournameOS up and running, there are several things you can try to add some polish for version 2. This task provides a few ideas—some of these are fairly simple additions to what you’ve got already, while others are quite advanced. Ask your tutor for help, read the manuals, and try to stretch yourself!
-
modify the scheduler to also save & restore the other registers (
r4
-r11
) on a context switch (as mentioned earlier) so that the processes are fully independent (currently, yournameOS v1 doesn’t preserve those registers, so if your processes are using them then the context switch will stuff things up) -
add support for an arbitrary number of processes (not just two)
-
add the ability for processes to sleep—to manually signal to the OS that they’re ready to be switched out
-
add the ability for processes to finish—to call their return address (in
lr
) and exit -
add process priorities, and a more complex scheduler which takes these priorities into account
-
add the ability to press the joystick and manually trigger a context switch, but be careful—what happens if another interrupts occurs while the scheduler function is executing?
-
advanced: use the synchronization instructions
ldrex
andstrex
to add a critical section so that each process can share a resource (e.g. a memory location) without stepping on each other’s toes (for reference, look at the Asynchronism lecture slides & recordings) -
advanced: use thread privileges & the Memory Protection Unit (Section B3.5 in the ARM reference manual) to ensure that each process can only read & write to its own (independent) sub-region of the microbit’s memory.
-
Galaxy Brain: build a VGA connector & driver using the GPIO pins, then port Doom to the microbit
Whatever you made for your extension task, push it up to GitLab with a short note for future-you to remind yourself what you actually did. Don’t forget to also write a suitably self-congratulatory commit message. Well done, you!
Coda
Folks, if you’re reading this, you’ve done it. You’re at the end of Twenty-Three-Hundred and you’ve completed all the lab material. I’m proud of you, and you should be proud of yourself: I bet you didn’t think you would be writing an OS when you starting adding registers in lab 2. We hope that this lab was a sufficient final boss battle!
-
If you need a refresher on this stuff, the interrupt lab is probably a good place to go. ↩
-
Section B1.5.6: Exception entry behavior on p587 of the ARM reference manual ↩
-
The full set of exception return values recognised by the microbit are shown in Table B1-9 on p596 of the ARM reference manual, but for the moment the one you’ll need is thread mode, main stack pointer which corresponds to the value
0xFFFFFFF9
. ↩ -
The documentation for
msr
is in Section A7.7.82 on p323 of the ARMv7-Mreference manual, also see Table B5-2 on page 729 for the bit mask. Note that we also have anmrs
instruction, “Move to Register from Special register” (A7.7.81) that can copy bits from the APSR to a regular register. ↩