How to create an OS from scratch

github.com

227 points by pykello 2 days ago


exDM69 - 2 days ago

Having built a few bare metal and hobby OS projects over the years, I would not recommend the path taken in this tutorial series.

Because if you want to write an OS, don't write a bootloader first. This article essentially describes a stage 1 bootloader on top of legacy BIOS firmware. It will teach you about historical x86 minutiae, which is nothing but a hindrance if you want to understand OS concepts.

Instead you should try to get a bootable ELF (multiboot) or PE (UEFI) image written in a high level language (C, C++, Rust, Zig, etc) as soon as possible. You can boot it up in QEMU very easily (compared to a boot sector image) and get a real debugger (gdb), system monitor and all the other invaluable tooling up. This will greatly improve the velocity of your project and get you to the interesting stuff faster.
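To make the multiboot route a bit more concrete, here is a minimal sketch of the three mandatory fields of a Multiboot 1 header, written as ordinary hosted C so the checksum rule can be checked on a normal machine. The struct and function names are made up for illustration. In a real kernel the header would be placed near the start of the image by the linker script, which this sketch does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiboot 1 magic value and two commonly used flag bits. */
#define MULTIBOOT_HEADER_MAGIC 0x1BADB002u
#define MULTIBOOT_PAGE_ALIGN   (1u << 0)  /* align boot modules on page boundaries */
#define MULTIBOOT_MEMORY_INFO  (1u << 1)  /* ask the loader for memory information */

/* Hypothetical name: only the three mandatory header fields. */
struct multiboot_header {
    uint32_t magic;
    uint32_t flags;
    uint32_t checksum;
};

/* Build a header whose three fields sum to zero modulo 2^32. */
static struct multiboot_header make_multiboot_header(uint32_t flags)
{
    struct multiboot_header h;
    h.magic = MULTIBOOT_HEADER_MAGIC;
    h.flags = flags;
    h.checksum = 0u - (MULTIBOOT_HEADER_MAGIC + flags);
    return h;
}

/* Check the magic value and the zero-sum rule a loader relies on. */
static int multiboot_header_is_valid(const struct multiboot_header *h)
{
    assert(h != NULL);
    return h->magic == MULTIBOOT_HEADER_MAGIC &&
           (uint32_t)(h->magic + h->flags + h->checksum) == 0u;
}
```

Calling `make_multiboot_header(MULTIBOOT_PAGE_ALIGN | MULTIBOOT_MEMORY_INFO)` gives a header that passes `multiboot_header_is_valid`. Getting this checksum wrong is a common reason a loader refuses an image.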

Because bare metal/OS projects may be hard to write but they are even harder to debug. You will need all the help you can get.

tombert - 2 days ago

I have noticed that every distributed app I build, as it gets more concurrent, I end up reinventing a lot of operating system work. I end up rebuilding scheduling, different caching techniques, heuristics to try and preemptively add things to cache, etc.

I really should learn more about kernel design. I would probably be a better distributed systems engineer if I lifted more concepts from OS design.

markus_zhang - 2 days ago

On a related topic - has anyone tried to move the build process of a very old Linux version, say Linux 0.92/0.12, to modern toolchains on modern computers? I believe the original build process requires gcc 1.40 and other programs such as `as` which are not available on modern systems.

The target is to be able to build Linux 0.12 using modern gcc and run it in QEMU, or preferably on a true 80386 machine. AFAIK, modern gcc still supports this architecture, so in principle this should be possible. There might be a LOT of changes to make in the source code, though.

The idea is to obtain a very old Linux that can be built easily on modern systems, with modern C support, and gradually add my own stuff using the same tools, so that I don't limit myself to very old toolchains.

Edit: The reason to pick Linux 0.12/0.92 is that 1) it is kinda complete (without networking, I believe) but not too complex (I believe the LOC count is under 20K), 2) it is not a toy OS and can run on real hardware (an 80386 machine), and 3) we have an excellent book for it: https://download.oldlinux.org/CLK-5.0-WithCover.pdf

senko - 2 days ago

Creating an OS is fun. It's the drivers and hardware support in general that get you. Thankless grind without which you get nowhere.

notorandit - 2 days ago

And ...

> Hey! This is an old, abandoned project, with both technical and design issues. Please have fun with this tutorial but do look for more modern and authoritative sources if you want to learn about OS design.

andsoitis - 2 days ago

Also see https://wiki.osdev.org/Getting_Started

codazoda - 2 days ago

The author dismisses this as out of date, but this is one of the most straightforward examples I’ve seen. At least reading 00 and 01.

rkagerer - 2 days ago

The crucial external document that lesson #3 references (and is all about) is now a dead link. Here's an archive - page 14 is where they point you:

https://web.archive.org/web/20241112015613if_/https://www.cs...

But here's a more succinct explanation of the reason for the memory offset: https://stackoverflow.com/a/51996005
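For anyone who wants the arithmetic spelled out, here is a small sketch in hosted C of the address calculation involved. It relies on two facts: a legacy BIOS loads the boot sector at physical address 0x7C00, and real mode forms a physical address as segment * 16 + offset. The function names and the example file offset are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BOOT_SECTOR_LOAD_ADDR 0x7C00u  /* where a legacy BIOS places the boot sector */
#define BOOT_SECTOR_SIZE      512u

/* Real-mode address translation: physical = segment * 16 + offset. */
static uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Address at which a label located file_offset bytes into the boot sector
 * ends up once the sector has been loaded into memory. */
static uint32_t boot_label_address(uint16_t file_offset)
{
    assert(file_offset < BOOT_SECTOR_SIZE);
    return BOOT_SECTOR_LOAD_ADDR + file_offset;
}
```

A label at a hypothetical file offset of 0x2D therefore lives at 0x7C2D at run time. That is why the code has to either tell the assembler about the load address or set a segment register such as 0x07C0 so that segment * 16 supplies the missing 0x7C00.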

deater - 2 days ago

As someone who has written my own OS from scratch (vmwOS) and teaches a class on it, I have to agree with a lot of the other comments that x86-based OS projects do end up being exercises in 40-year-old PC/x86 retrocomputing.

A few years ago I would have recommended the path I took (writing an OS for the Raspberry Pi) but the Pis have gone off the rails recently. So writing a simple OS for a Pi-1B+ is relatively doable (simple enough, sort of OK documentation, biggest downside is needing USB for the keyboard).

Things led to disaster once everyone wanted to use Pi4 (which was all we could manage to source during the CPU shortage of '23) as the documentation is poor, getting interrupts going became nearly impossible, and the virtual memory/cache/etc setup on the 64-bit cores (at least a few years ago) was not documented well at all.

owenpalmer - 2 days ago

from the readme:

> College is hard so I don't remember most of it.

Interesting how counter-productive high stress environments can be for ingraining knowledge.

EFreethought - 2 days ago

If you wish to create an OS from scratch, you must first invent the universe.

forbiddenvoid - 2 days ago

Every time one of these pops up on HN, it's always an abandoned, unfinished project. Why do these OS projects never get completed?

ljsprague - 2 days ago

I still don't understand what's "running" the boot sector.

notorandit - 2 days ago

x86 only, unfortunately. But some parts can be borrowed for other ISAs.

p0w3n3d - 2 days ago

I wonder - does the GUI really belong to the OS? In all the examples I can give, the GUI is an application that runs on top of the kernel and hosts the graphics calls. However, all the systems I know have non-GUI applications that can run before the GUI starts.

So I'd say that the GUI is not a part of the OS... Please tell me whether you agree or not.

sim7c00 - 2 days ago

First off, this _is_ a nice tutorial for what it is. It goes a little further than a lot of similar ones and it's easy to follow along. Wanna say that clearly before:

I’ve always found it curious that most OS dev tutorials still focus on x86_32, even though nearly all CPUs today are 64-bit. It’s probably because older materials are easier to follow, but starting with x86_64 is more relevant and often cleaner.

Yes, 64-bit requires paging from the start, which adds complexity early on. But you can skip the whole 32-bit protected mode setup and jump straight from real mode to long mode, or better yet, use UEFI. UEFI lets you boot directly into a 64-bit context, and it's much easier to work with than BIOS once you get the hang of it. You’ll learn more by writing your own boot code than copying legacy snippets.

UEFI support is straightforward, and you can still support BIOS if needed (most old machines are x64 anyway? it's been around for ages now...). Since the ESP is FAT32, you can place it early on disk and still leave room for a legacy bootsector if you want dual support. You can even run different payloads depending on the boot method. EDK2 has a learning curve, but once you understand how to resolve protocols and call services, it’s a huge upgrade. Most of it can be written in plain C. I only use a small inline asm trampoline to set up the stack and jump to stage 2.

Also, skip legacy stuff like PIC/PIT. They’re emulated now. Use LAPIC for interrupts and timers, and look into MSI/MSI-X for modern interrupt routing.

One thing I often see missing in older tutorials is a partition table in the bootsector. Without it, AHCI in QEMU won’t detect your disk, at least on some versions, and this again shows how crumbly and differently implemented some things can be (neither the AHCI nor the SATA spec requires this, so it's a tricky one if it hits you :D...). It’s easy to add, and makes sense if you want multiple partitions. UEFI helps here too: you can build a disk image with real partitions (e.g., FAT32 for boot, ext2/4 for root) and implement them properly. If you don't take into account that your system will be using partitions and filesystems within those partitions, it's gonna be a lot of re-writing.
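As a rough illustration of how little is involved in the partition table, here is a sketch in hosted C that fills one classic MBR partition entry and the boot signature into a 512-byte sector buffer. The offsets (table at byte 446, four 16-byte entries, 0x55 0xAA at bytes 510 and 511) and the type codes are the standard MBR layout. The helper names are made up, and the CHS fields are left at zero as a simplification, so treat it as a starting point rather than something guaranteed to satisfy every tool or emulator version.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE         512u
#define MBR_PART_TABLE_OFF  446u  /* 0x1BE: start of the four partition entries */
#define MBR_PART_ENTRY_SIZE 16u
#define MBR_SIGNATURE_OFF   510u  /* 0x55, 0xAA live in the last two bytes */

#define MBR_TYPE_FAT32_LBA  0x0Cu
#define MBR_TYPE_LINUX      0x83u

/* MBR fields are little-endian regardless of the host. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill partition entry `index` (0..3) of a 512-byte boot sector.
 * CHS start/end bytes stay zero in this sketch; only LBA values are set. */
static void mbr_set_partition(uint8_t *sector, unsigned index, uint8_t type,
                              int bootable, uint32_t lba_start,
                              uint32_t sector_count)
{
    assert(index < 4u);
    uint8_t *entry = sector + MBR_PART_TABLE_OFF + index * MBR_PART_ENTRY_SIZE;
    memset(entry, 0, MBR_PART_ENTRY_SIZE);
    entry[0] = bootable ? 0x80u : 0x00u;  /* status byte: 0x80 = active */
    entry[4] = type;                      /* partition type code */
    put_le32(entry + 8, lba_start);       /* first sector of the partition */
    put_le32(entry + 12, sector_count);   /* length in sectors */
}

static void mbr_set_signature(uint8_t *sector)
{
    sector[MBR_SIGNATURE_OFF] = 0x55u;
    sector[MBR_SIGNATURE_OFF + 1u] = 0xAAu;
}

static int mbr_has_signature(const uint8_t *sector)
{
    return sector[MBR_SIGNATURE_OFF] == 0x55u &&
           sector[MBR_SIGNATURE_OFF + 1u] == 0xAAu;
}
```

A build script could call `mbr_set_partition(sector, 0, MBR_TYPE_FAT32_LBA, 1, 2048, 65536)` on the assembled boot sector before writing the disk image. The boot code must stay within the first 446 bytes so the table does not overwrite it.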

Structure of the repo also matters. A clean layout makes it easier to grow your OS without it turning into a mess. This project seems to suggest a nice structure from what I can tell, even though it is of course modelled around the tutorial itself. Architecture independence is also often forgotten. That matters especially when working in newer languages like Rust, which might be appealing, but most C/C++ code can also easily be made portable if you plan ahead for your OS running on different platforms. QEMU can emulate a lot of them for you to test things on.

TL;DR: Most tutorials teach you how to build an OS for 1990s hardware. Fun, but not very useful today. If you want to learn how modern OSes work, start with modern protocols and hardware. Some are more complex, but many are easier to use and better documented, which can actually speed up development. It also reduces the need to port higher-level systems over to newer low-level code once you decide you want your OS to run on something less than 20 years old. (Athlon64 was 2003!)

noone_youknow - 2 days ago

While I agree this might be a fun resource and useful example code for various aspects of legacy x86 interfacing, I would urge anyone who hopes to actually get into OS development to ignore this (and in fact every other tutorial I’ve ever seen, including those hosted on the popular sites).

For all the reasons stated in the link from the README [1], which the author agrees with, this project should not be followed if one wants to gain an understanding of the design and implementation of operating systems for modern systems. Following it will likely lead only to another abandoned “hello world plus shell” that runs only in emulation of decades-old hardware.

My advice is get the datasheets and programmers’ manuals (which are largely free) and use those to find ways to implement your own ideas.

[1] https://github.com/cfenollosa/os-tutorial/issues/269
