Building an operating system incrementally

This page was last updated on 2018-10-15.

Introduction

I've picked up the very good habit of taking notes whenever I face a challenging task. I call these tasks "quests". Learning how operating systems work is a quest for me, but there are others (like learning functional programming, the USB protocol stack, etc.), and I keep a diary for each one. Whenever I take a step forward, I write my progress down (maybe I'll make one of these diaries available in the future).

This document takes a similar approach to learning about operating systems: you can follow my progress here. It may help you as well.

One of the problems with the teaching of operating systems is that authors (and professors) usually begin with modern architectures and modern concepts. I'm not aware of any book that takes a historical approach (if you know of one, please let me know): one that starts with basic 8-bit or 16-bit processors, makes a very simple single-tasking operating system, and builds up to a more complex one using modern approaches. That is what I'll try to do here.

As I said, I'm a newbie. I won't try to explain everything here, and I'll link to references whenever possible. Also, there are probably many errors on this page so, again, if you find any, let me know.

Source code and tools

The source code of this page is available in a single tarball: osdev-inc-code.tar.gz.

This page is written using the literate programming approach. If you take the page's source, written in reStructuredText, and run noweb on it, you can extract the source code yourself. But I have already done that for you: the extracted files are bundled in the tarball linked above.

For the code, we will use the following tools:

  • nasm, for 16-bit assembly;
  • make, to compile it (either BSD make or GNU make should work);
  • /bin/sh, for the scripts that launch the examples;
  • qemu, to run the examples (called from the /bin/sh scripts).

Starting from the beginning: the BIOS and the MBR

I won't dive into what the BIOS and the MBR are (I've linked these words so you can go to Wikipedia to learn more). In summary, the BIOS, or Basic Input/Output System, is firmware that performs several functions when you power your computer up. As its final step, it loads the first sector (i.e., 512 bytes) of your HDD into memory and starts executing it. This first sector is called the MBR, or Master Boot Record.

Of course, that was in the good old days, when UEFI and GPT (the modern way computers boot up) didn't exist yet, but that is another story. Who knows, maybe we will explain UEFI and GPT on this page in the future.

For a great explanation of BIOS, MBR, UEFI and GPT and how they compare, please see AdamW's page: UEFI boot: how does that actually work, then? This is mandatory reading, one of the best articles I've read on the subject.

One important thing Adam wrote is (as of 2018-10-10):

There is no BIOS specification. BIOS is a de facto standard – it works the way it worked on actual IBM PCs, in the 1980s. That’s kind of one of the reasons UEFI exists.

One more:

The MBR is another de facto standard; basically, the very start of the disk describes the partitions on the disk in a particular format, and contains a ‘boot loader’, a very small piece of code that a BIOS firmware knows how to execute, whose job it is to boot the operating system(s).

And yet another one:

In the BIOS world, absolutely all forms of multi-booting are handled above the firmware layer. The firmware layer doesn’t really know what a bootloader is, or what an operating system is. Hell, it doesn’t know what a partition is. All it can do is run the boot loader from a disk’s MBR. You also cannot configure the boot process from outside of the firmware.

This is important because, in this text, we are going to make the following decisions:

  • We are going to make a very simple 16-bit operating system in x86 Real Mode, using BIOS calls;
  • The bootloader and the kernel will not know about partitions, for now.

Our simple bootloader will just load the kernel from a known location on the hard disk and jump to its first instruction.

Note

For a very nice 16-bit operating system written in assembly, see MikeOS.

Before starting to develop that, we are going to make some examples to get used to the BIOS calls in Real Mode.

Remember that AdamW told us there is no BIOS specification? So we have to rely on manufacturers' manuals, like the PhoenixBIOS 4.0 Programmer's Guide Version 1.0, referred to here simply as the BIOS Programmer Guide.

Looking at the BIOS Programmer Guide, we see that interrupt 10h is used for "Video Services". If register ah = 0Eh, int 10h writes a character in teletype mode: it prints the character, advances the cursor and interprets control characters such as CR and NL, whereas function 0Ah ("Write character at cursor") leaves the cursor where it is. In register al we just store the ASCII code of the character to be written.

Let's do it:

<<bios-mbr-helloworld-1/helloworld.S>>=

%define CR 0Dh     ; Carriage return
%define NL 0Ah     ; New line

mov ah, 0Eh        ; Teletype write character

; Now, print every character by storing it in al and calling int 10h.

mov al, 'H'
int 10h
mov al, 'e'
int 10h
mov al, 'l'
int 10h
mov al, 'l'
int 10h
mov al, 'o'
int 10h
mov al, CR
int 10h
mov al, NL
int 10h

mov al, 'W'
int 10h
mov al, 'o'
int 10h
mov al, 'r'
int 10h
mov al, 'l'
int 10h
mov al, 'd'
int 10h
mov al, '!'
int 10h
mov al, CR
int 10h
mov al, NL
int 10h

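jmp $                  ; Hang here: jump to this same instruction forever

times 510-($-$$) db 0  ; Pad with zeros up to offset 510 of the sector

dw 0xAA55              ; Boot signature: bytes 55h, AAh at offsets 510-511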

The code is almost self-explanatory if you know the basics of assembly (teaching it is not the purpose of this page). To learn some assembly, I highly recommend the amazing Programming from the Ground Up, by Jonathan Bartlett.

One interesting thing to observe is that, to move to the next line, we need to use both CR and NL (the pair usually known as CRLF). According to Wikipedia, this convention comes from the old teletypes and made its way into MS-DOS. The IBM PC BIOS follows the same convention: if you use only NL, the cursor moves to the line just below without returning to the first column, and if you use only CR, it returns to the beginning of the current line.
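To see the two control codes concretely, here is a small /bin/sh sketch that dumps the bytes of a string as hex. The helper name hexbytes is made up for this illustration and is not part of the tarball. The last two bytes it prints should match the CR (0Dh) and NL (0Ah) values defined at the top of helloworld.S.

```shell
#!/bin/sh
# Print the bytes of a string as hex digits, to see which codes CR and NL are.
# hexbytes is a hypothetical helper name, not part of the original sources.
hexbytes() {
    # %b makes printf interpret backslash escapes such as \r and \n.
    # od prints one hex byte per column; tr strips the spaces and newline.
    printf '%b' "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

hexbytes 'Hello\r\n'    # prints 48656c6c6f0d0a
```

The trailing 0d and 0a are the carriage return and the new line, in the same order the assembly code sends them to int 10h.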

So, we finally define the Makefile for this example:

<<bios-mbr-helloworld-1/Makefile>>=

TARGETS = helloworld.bin

all: ${TARGETS}

helloworld.bin: helloworld.S
        nasm -o $@ $?

.PHONY: clean
clean:
        rm -f ${TARGETS}

We also write two scripts to call qemu: one for graphics mode (it will probably use SDL to draw the screen) and another without graphics mode, which sends the output to your Unix terminal (I'm assuming you are using a Unix-like operating system).

<<bios-mbr-helloworld-1/run-qemu.sh>>=

#!/bin/sh
qemu-system-i386 -boot c helloworld.bin

<<bios-mbr-helloworld-1/run-qemu-nographic.sh>>=

#!/bin/sh
qemu-system-i386 -nographic -curses -boot c helloworld.bin
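Before booting the image, it can be worth checking that the build really produced a boot sector. The sketch below assumes only what helloworld.S already states: the times line pads the code to offset 510, and dw 0xAA55 stores the bytes 55h and AAh (little-endian) at the end, for 512 bytes in total. The function name check_bootsector is hypothetical and not part of the tarball.

```shell
#!/bin/sh
# Sanity-check a boot sector image: exactly 512 bytes, ending in 55h AAh.
# check_bootsector is a hypothetical helper name, not part of the tarball.
check_bootsector() {
    img=$1

    # wc -c prints the byte count; tr strips the padding some wc versions add.
    size=$(wc -c < "$img" | tr -d ' ')
    if [ "$size" -ne 512 ]; then
        echo "$img: expected 512 bytes, got $size" >&2
        return 1
    fi

    # Read the two bytes at offsets 510 and 511 as hex digits.
    sig=$(od -An -tx1 -j510 -N2 "$img" | tr -d ' \n')
    if [ "$sig" != "55aa" ]; then
        echo "$img: bad boot signature: $sig" >&2
        return 1
    fi

    echo "$img: looks like a valid boot sector"
}

# Only run the check when an image is given on the command line.
if [ $# -gt 0 ]; then
    check_bootsector "$1"
fi
```

Saved under a name of your choice (say, check.sh), it could be run as sh check.sh helloworld.bin after make. A non-zero exit status means the size or the signature is not what a BIOS conventionally expects.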