NANDHOO.

Chapter 15: x86/x64 Kernel Development


Introduction


Building a kernel for the x86/x64 architecture requires an understanding of CPU modes, memory management, interrupt handling, and device interaction. This chapter guides you through creating a minimal but functional x86/x64 kernel from scratch, including bootloader integration, protected/long mode setup, and essential kernel services.


Why This Matters


x86/x64 dominates desktop and server computing. Understanding x86/x64 kernel development teaches you about real-world operating system implementation. The concepts apply to professional kernel work, whether contributing to Linux, developing embedded systems, or creating custom operating systems.


How to Study This Chapter


  1. Build incrementally - Start minimal, add features one at a time
  2. Test frequently - Every change should be tested in QEMU
  3. Read Intel manuals - Authoritative reference for x86/x64
  4. Study Linux source - See how professionals do it
  5. Debug systematically - Use GDB and QEMU monitor

Project Setup


Directory Structure


kernel-project/
├── boot/
│   ├── boot.asm          # Bootloader stage 1
│   └── boot2.asm         # Bootloader stage 2
├── kernel/
│   ├── kernel.c          # Kernel main
│   ├── idt.c             # Interrupt handling
│   ├── memory.c          # Memory management
│   ├── process.c         # Process management
│   └── drivers/
│       ├── vga.c         # VGA text mode driver
│       └── keyboard.c    # Keyboard driver
├── include/
│   ├── kernel.h
│   ├── idt.h
│   └── types.h
├── linker.ld             # Linker script
└── Makefile

Build System


Makefile:

# Makefile for x64 kernel

AS = nasm
CC = gcc
LD = ld
QEMU = qemu-system-x86_64

CFLAGS = -m64 -ffreestanding -fno-pie -fno-stack-protector -mno-red-zone \
         -nostdlib -Iinclude -Wall -Wextra
# --oformat binary emits a flat image, which stage 2 expects at 0x10000
LDFLAGS = -T linker.ld -nostdlib --oformat binary

BOOT_SRC = boot/boot.asm boot/boot2.asm
KERNEL_SRC = $(wildcard kernel/*.c kernel/drivers/*.c)
KERNEL_OBJ = $(KERNEL_SRC:.c=.o)

.PHONY: all clean run debug

all: os.img

boot/boot.bin: boot/boot.asm
	$(AS) -f bin -o $@ $<

boot/boot2.bin: boot/boot2.asm
	$(AS) -f bin -o $@ $<

kernel/entry.o: kernel/entry.asm
	$(AS) -f elf64 -o $@ $<

%.o: %.c
	$(CC) $(CFLAGS) -c -o $@ $<

# entry.o is linked first so that _start lands at the load address
kernel.bin: kernel/entry.o $(KERNEL_OBJ) linker.ld
	$(LD) $(LDFLAGS) -o $@ kernel/entry.o $(KERNEL_OBJ)

# Image layout: sector 0 = stage 1, sectors 1-8 = stage 2, sector 9 onward = kernel
os.img: boot/boot.bin boot/boot2.bin kernel.bin
	dd if=/dev/zero of=$@ bs=512 count=2880
	dd if=boot/boot.bin of=$@ bs=512 count=1 conv=notrunc
	dd if=boot/boot2.bin of=$@ bs=512 seek=1 conv=notrunc
	dd if=kernel.bin of=$@ bs=512 seek=9 conv=notrunc

run: os.img
	$(QEMU) -drive file=os.img,format=raw -serial stdio

debug: os.img
	$(QEMU) -drive file=os.img,format=raw -serial stdio -s -S &
	gdb -ex "target remote :1234" -ex "break kernel_main" -ex "continue"

clean:
	rm -f boot/*.bin kernel/*.o kernel/drivers/*.o kernel.bin os.img


Bootloader (x64)


Stage 1: MBR Bootloader


boot/boot.asm:

; Stage 1 bootloader - Loads stage 2
BITS 16
ORG 0x7C00

start:
    ; Set up segments
    xor ax, ax
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov sp, 0x7C00

    ; Save boot drive
    mov [boot_drive], dl

    ; Load stage 2 (8 sectors from sector 2)
    mov ah, 0x02                ; Read sectors
    mov al, 8                   ; 8 sectors (4 KB)
    mov ch, 0                   ; Cylinder 0
    mov cl, 2                   ; Start at sector 2
    mov dh, 0                   ; Head 0
    mov dl, [boot_drive]
    mov bx, 0x7E00              ; Load at 0x7E00
    int 0x13
    jc disk_error

    ; Jump to stage 2
    jmp 0x7E00

disk_error:
    mov si, msg_error
.print:
    lodsb
    test al, al
    jz .hang
    mov ah, 0x0E
    int 0x10
    jmp .print
.hang:
    cli
    hlt
    jmp .hang

boot_drive: db 0
msg_error:  db 'Disk error', 0

times 510-($-$$) db 0
dw 0xAA55
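
The INT 0x13 read above addresses the disk by cylinder, head, and 1-based sector, while the dd commands in the Makefile place data by 0-based block offset (LBA). The following is a minimal sketch of the conversion between the two. The helper name and the floppy geometry used in the usage note (2 heads, 18 sectors per track) are assumptions for illustration and are not part of the chapter's sources.

```c
#include <stdint.h>

/* Hypothetical helper type: a cylinder/head/sector triple. */
struct chs {
    uint16_t cylinder;
    uint8_t  head;
    uint8_t  sector;   /* 1-based, as INT 0x13 expects */
};

/* Convert a 0-based LBA to CHS for the given disk geometry. */
static struct chs lba_to_chs(uint32_t lba, uint32_t heads,
                             uint32_t sectors_per_track)
{
    struct chs out;
    out.cylinder = (uint16_t)(lba / (heads * sectors_per_track));
    out.head     = (uint8_t)((lba / sectors_per_track) % heads);
    out.sector   = (uint8_t)((lba % sectors_per_track) + 1);
    return out;
}
```

Under that geometry, lba_to_chs(1, 2, 18) gives cylinder 0, head 0, sector 2, which is where stage 1 starts reading, and lba_to_chs(9, 2, 18) gives sector 10, which is where the kernel image begins.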


Stage 2: Extended Bootloader with Long Mode


boot/boot2.asm:

BITS 16
ORG 0x7E00

; GDT selectors (byte offsets into the GDT defined at the end of this file)
CODE64_SEL equ 0x08
DATA_SEL   equ 0x10
CODE32_SEL equ 0x18

; The page tables live at fixed physical addresses below the bootloader.
; They cannot be part of this image: stage 1 loads only 8 sectors (4 KB),
; and three page-aligned tables need 12 KB.
pml4 equ 0x1000
pdpt equ 0x2000
pd   equ 0x3000

stage2:
    ; Stage 1 leaves the BIOS boot drive in DL; its labels are not visible here
    mov [boot_drive], dl

    mov si, msg_stage2
    call print

    ; Enable A20 line
    call enable_a20

    ; Load kernel (32 sectors from sector 10 to 0x10000)
    mov ah, 0x02
    mov al, 32
    mov ch, 0
    mov cl, 10
    mov dh, 0
    mov dl, [boot_drive]
    mov bx, 0x1000
    mov es, bx
    xor bx, bx
    int 0x13
    jc disk_error

    ; Enter protected mode
    cli
    lgdt [gdt_descriptor]

    mov eax, cr0
    or eax, 1
    mov cr0, eax

    jmp CODE32_SEL:protected_mode

enable_a20:
    in al, 0x92
    or al, 2
    out 0x92, al
    ret

print:
    lodsb
    test al, al
    jz .done
    mov ah, 0x0E
    int 0x10
    jmp print
.done:
    ret

disk_error:
    mov si, msg_disk_error
    call print
    cli
.hang:
    hlt
    jmp .hang

BITS 32
protected_mode:
    ; Set up segments
    mov ax, DATA_SEL
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax
    mov esp, 0x90000

    ; Set up paging for long mode
    call setup_paging

    ; Enable long mode
    ; Set PAE bit in CR4
    mov eax, cr4
    or eax, 1 << 5
    mov cr4, eax

    ; Load PML4 address
    mov eax, pml4
    mov cr3, eax

    ; Enable long mode in EFER MSR
    mov ecx, 0xC0000080
    rdmsr
    or eax, 1 << 8
    wrmsr

    ; Enable paging
    mov eax, cr0
    or eax, 1 << 31
    mov cr0, eax

    ; Jump to long mode. The target must be a 64-bit code segment (L bit set);
    ; a 32-bit descriptor would leave the CPU in compatibility mode.
    jmp CODE64_SEL:long_mode

setup_paging:
    ; Identity map first 2 MB
    ; Zero the three tables (3 x 1024 dwords)
    cld
    mov edi, pml4
    xor eax, eax
    mov ecx, 3 * 1024
    rep stosd

    ; PML4[0] -> PDPT
    mov dword [pml4], pdpt + 3

    ; PDPT[0] -> PD
    mov dword [pdpt], pd + 3

    ; PD[0] -> 2MB page (huge page)
    mov dword [pd], 0x83        ; Present, writable, huge page

    ret

BITS 64
long_mode:
    ; Reload the GDT with the full 64-bit base
    lgdt [gdt_descriptor]

    ; Set up segments
    mov ax, DATA_SEL
    mov ds, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
    mov ss, ax

    ; Set up stack
    mov rsp, 0x90000

    ; Call the kernel entry point
    mov rax, 0x10000
    call rax

    ; Hang if kernel returns
    cli
.hang64:
    hlt
    jmp .hang64

; GDT for protected and long mode
gdt_start:
    dq 0                        ; Null descriptor

gdt_code64:                     ; Selector 0x08: 64-bit code segment
    dw 0xFFFF                   ; Limit low
    dw 0                        ; Base low
    db 0                        ; Base middle
    db 10011010b                ; Access: present, ring 0, code, exec/read
    db 10101111b                ; Flags (G=1, L=1, D=0) + Limit high
    db 0                        ; Base high

gdt_data:                       ; Selector 0x10: data segment
    dw 0xFFFF
    dw 0
    db 0
    db 10010010b                ; Access: present, ring 0, data, read/write
    db 11001111b
    db 0

gdt_code32:                     ; Selector 0x18: 32-bit code segment
    dw 0xFFFF
    dw 0
    db 0
    db 10011010b                ; Access: present, ring 0, code, exec/read
    db 11001111b                ; Flags (G=1, D=1) + Limit high
    db 0

gdt_end:

gdt_descriptor:
    dw gdt_end - gdt_start - 1
    dq gdt_start

boot_drive:     db 0
msg_stage2:     db 'Stage 2', 13, 10, 0
msg_disk_error: db 'Disk error', 0

times 4096-($-$$) db 0          ; Pad to 4 KB
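
Each GDT entry in the listing above is eight bytes with the base and limit scattered across several fields. The following is a small sketch of how those fields pack into one 64-bit value, useful for checking a hand-written descriptor. The function name is hypothetical. The flags nibble is assumed to be ordered G, D/B, L, AVL from the high bit down, so 0xC describes a 32-bit segment and 0xA a 64-bit code segment.

```c
#include <stdint.h>

/* Pack base, limit, access byte, and flags nibble into a GDT descriptor.
 * Hypothetical helper for illustration; not part of the chapter's sources. */
static uint64_t gdt_descriptor_encode(uint32_t base, uint32_t limit,
                                      uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFF);               /* limit bits 15:0   */
    d |= (uint64_t)(base & 0xFFFFFF) << 16;        /* base bits 23:0    */
    d |= (uint64_t)access << 40;                   /* access byte       */
    d |= (uint64_t)((limit >> 16) & 0xF) << 48;    /* limit bits 19:16  */
    d |= (uint64_t)(flags & 0xF) << 52;            /* G, D/B, L, AVL    */
    d |= (uint64_t)((base >> 24) & 0xFF) << 56;    /* base bits 31:24   */
    return d;
}
```

With base 0 and limit 0xFFFFF, access 0x9A and flags 0xA give 0x00AF9A000000FFFF, the usual flat 64-bit code descriptor, while flags 0xC give 0x00CF9A000000FFFF, a 32-bit code descriptor. The only difference is the L and D bits, which is why the wrong flags byte silently leaves the CPU in compatibility mode.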


Kernel Entry Point


kernel/entry.asm:

BITS 64

extern kernel_main

global _start

section .text

_start:
    ; Set up stack
    mov rsp, stack_top

    ; Clear direction flag
    cld

    ; Call kernel main
    call kernel_main

    ; Hang if kernel returns
    cli
.hang:
    hlt
    jmp .hang

section .bss
align 16
stack_bottom:
    resb 16384                  ; 16 KB stack
stack_top:


Linker Script


linker.ld:

ENTRY(_start)

SECTIONS
{
    . = 0x10000;

    .text : {
        *(.text)
    }

    .rodata : {
        *(.rodata)
    }

    .data : {
        *(.data)
    }

    .bss : {
        *(.bss)
        *(COMMON)
    }
}


Kernel Main


include/types.h:

#ifndef TYPES_H
#define TYPES_H

typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned int uint32_t;
typedef unsigned long long uint64_t;

typedef signed char int8_t;
typedef signed short int16_t;
typedef signed int int32_t;
typedef signed long long int64_t;

typedef uint64_t size_t;
typedef uint8_t bool;

#define true 1
#define false 0
#define NULL ((void*)0)

#endif


kernel/kernel.c:

#include "types.h"

// VGA text mode
#define VGA_MEMORY 0xB8000
#define VGA_WIDTH 80
#define VGA_HEIGHT 25

static uint16_t *vga_buffer = (uint16_t *)VGA_MEMORY;
static int cursor_x = 0;
static int cursor_y = 0;

void putchar(char c) {
    if (c == '\n') {
        cursor_x = 0;
        cursor_y++;
    } else {
        int offset = cursor_y * VGA_WIDTH + cursor_x;
        // White on black; the cast keeps a negative char out of the attribute byte
        vga_buffer[offset] = (0x0F << 8) | (uint8_t)c;
        cursor_x++;
    }

    if (cursor_x >= VGA_WIDTH) {
        cursor_x = 0;
        cursor_y++;
    }

    if (cursor_y >= VGA_HEIGHT) {
        // Scroll: move every row up by one, then blank the last row
        for (int y = 1; y < VGA_HEIGHT; y++) {
            for (int x = 0; x < VGA_WIDTH; x++) {
                vga_buffer[(y - 1) * VGA_WIDTH + x] =
                    vga_buffer[y * VGA_WIDTH + x];
            }
        }
        cursor_y = VGA_HEIGHT - 1;
        for (int x = 0; x < VGA_WIDTH; x++) {
            vga_buffer[cursor_y * VGA_WIDTH + x] = 0;
        }
    }
}

void puts(const char *str) {
    while (*str) {
        putchar(*str++);
    }
}

void clear_screen(void) {
    for (int i = 0; i < VGA_WIDTH * VGA_HEIGHT; i++) {
        vga_buffer[i] = 0;
    }
    cursor_x = 0;
    cursor_y = 0;
}

// Port I/O
static inline void outb(uint16_t port, uint8_t val) {
    asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
}

static inline uint8_t inb(uint16_t port) {
    uint8_t ret;
    asm volatile("inb %1, %0" : "=a"(ret) : "Nd"(port));
    return ret;
}

void kernel_main(void) {
    clear_screen();
    puts("x64 Kernel loaded!\n");
    puts("Hello from kernel space!\n");

    // Hang
    while (1) {
        asm("hlt");
    }
}
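
Each cell of the VGA text buffer used above is a 16-bit value: the low byte is the character and the high byte is the colour attribute (background in the upper nibble, foreground in the lower). The following is a small sketch of a helper that builds such a cell. The function and enum names are assumptions for illustration; the chapter's putchar hard-codes the attribute 0x0F instead.

```c
#include <stdint.h>

/* A few of the 16 standard VGA text-mode colours. */
enum vga_color {
    VGA_BLACK      = 0,
    VGA_LIGHT_GREY = 7,
    VGA_WHITE      = 15
};

/* Build one text-mode cell: attribute in the high byte, character in the low. */
static inline uint16_t vga_entry(char c, uint8_t fg, uint8_t bg)
{
    uint8_t attr = (uint8_t)(((bg & 0x0F) << 4) | (fg & 0x0F));
    return (uint16_t)(((uint16_t)attr << 8) | (uint8_t)c);
}
```

For example, vga_entry('A', VGA_WHITE, VGA_BLACK) yields 0x0F41, the same value putchar writes for 'A'. A helper like this would also let clear_screen fill the buffer with visible blanks such as vga_entry(' ', VGA_LIGHT_GREY, VGA_BLACK) rather than zero cells.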


Interrupt Handling


include/idt.h:

#ifndef IDT_H
#define IDT_H

#include "types.h"


void init_idt(void);


#endif


kernel/idt.c:

#include "idt.h"
#include "types.h"

extern void puts(const char *);

// One 16-byte long-mode gate descriptor
struct idt_entry {
    uint16_t offset_low;
    uint16_t selector;
    uint8_t ist;
    uint8_t type_attr;
    uint16_t offset_mid;
    uint32_t offset_high;
    uint32_t reserved;
} __attribute__((packed));

struct idt_ptr {
    uint16_t limit;
    uint64_t base;
} __attribute__((packed));

static struct idt_entry idt[256];
static struct idt_ptr idtr;

// Exception handlers (implemented in assembly)
extern void isr0(void);
extern void isr13(void);
extern void isr14(void);

void set_idt_entry(int num, uint64_t handler) {
    idt[num].offset_low = handler & 0xFFFF;
    idt[num].selector = 0x08;       // Kernel code segment
    idt[num].ist = 0;
    idt[num].type_attr = 0x8E;      // Present, ring 0, interrupt gate
    idt[num].offset_mid = (handler >> 16) & 0xFFFF;
    idt[num].offset_high = (handler >> 32);
    idt[num].reserved = 0;
}

void init_idt(void) {
    // Set up exception handlers
    set_idt_entry(0, (uint64_t)isr0);
    set_idt_entry(13, (uint64_t)isr13);
    set_idt_entry(14, (uint64_t)isr14);

    // Load IDT
    idtr.limit = sizeof(idt) - 1;
    idtr.base = (uint64_t)&idt;
    asm volatile("lidt %0" : : "m"(idtr));

    // Hardware interrupts stay disabled here. Run "sti" only after the PIC
    // has been remapped and IRQ handlers are installed; with the BIOS
    // default mapping the first timer IRQ arrives at vector 8, which has
    // no handler in this table.
}

// Exception handlers (C)
void divide_error_handler(void) {
    puts("EXCEPTION: Divide Error\n");
    while (1) asm("hlt");
}

void general_protection_handler(void) {
    puts("EXCEPTION: General Protection Fault\n");
    while (1) asm("hlt");
}

void page_fault_handler(void) {
    puts("EXCEPTION: Page Fault\n");
    while (1) asm("hlt");
}
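
A long-mode IDT gate is 16 bytes, and the 64-bit handler address is split across three fields. The following is a host-compilable sketch that mirrors the struct from idt.c under a different, hypothetical name and checks both properties. It uses <stdint.h> in place of the chapter's types.h so it can be built outside the kernel.

```c
#include <stdint.h>

/* Same layout as struct idt_entry in idt.c (renamed for this sketch). */
struct idt_gate {
    uint16_t offset_low;
    uint16_t selector;
    uint8_t  ist;
    uint8_t  type_attr;
    uint16_t offset_mid;
    uint32_t offset_high;
    uint32_t reserved;
} __attribute__((packed));

/* Catch accidental padding at compile time. */
_Static_assert(sizeof(struct idt_gate) == 16,
               "long-mode IDT gates must be 16 bytes");

/* Split a handler address into the three offset fields. */
static struct idt_gate idt_gate_make(uint64_t handler, uint16_t selector,
                                     uint8_t type_attr)
{
    struct idt_gate g;
    g.offset_low  = (uint16_t)(handler & 0xFFFF);
    g.selector    = selector;
    g.ist         = 0;
    g.type_attr   = type_attr;
    g.offset_mid  = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_high = (uint32_t)(handler >> 32);
    g.reserved    = 0;
    return g;
}

/* Reassemble the handler address from a gate. */
static uint64_t idt_gate_offset(const struct idt_gate *g)
{
    return (uint64_t)g->offset_low
         | ((uint64_t)g->offset_mid << 16)
         | ((uint64_t)g->offset_high << 32);
}
```

Round-tripping an address through idt_gate_make and idt_gate_offset is a quick way to catch a wrong shift or a missing packed attribute before the table is ever loaded with lidt.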


kernel/isr.asm (add to entry.asm or separate file):

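BITS 64

extern divide_error_handler
extern general_protection_handler
extern page_fault_handler

global isr0
global isr13
global isr14

; The C handlers in idt.c never return, so saving four registers is enough
; here. A handler that does return must save every caller-saved register
; (rax, rcx, rdx, rsi, rdi, r8-r11) before calling into C.

isr0:
    push rax
    push rbx
    push rcx
    push rdx
    call divide_error_handler
    pop rdx
    pop rcx
    pop rbx
    pop rax
    iretq

isr13:
    push rax
    push rbx
    push rcx
    push rdx
    call general_protection_handler
    pop rdx
    pop rcx
    pop rbx
    pop rax
    add rsp, 8                  ; Discard the error code pushed by the CPU
    iretq

isr14:
    push rax
    push rbx
    push rcx
    push rdx
    call page_fault_handler
    pop rdx
    pop rcx
    pop rbx
    pop rax
    add rsp, 8                  ; Discard the error code pushed by the CPU
    iretq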


Memory Management


kernel/memory.c:

#include "types.h"

#define PAGE_SIZE 4096
#define MEMORY_START 0x100000   // 1 MB
#define MEMORY_END 0x1000000    // 16 MB

// Note: the stage 2 bootloader identity-maps only the first 2 MB, so memory
// above 0x200000 must be mapped before an allocation from it is touched.

static uint8_t *next_free = (uint8_t *)MEMORY_START;

// Bump allocator: hands out consecutive 16-byte-aligned blocks
void *kmalloc(size_t size) {
    // Align to 16 bytes
    size = (size + 15) & ~15;

    if ((uint64_t)next_free + size > MEMORY_END) {
        return NULL;  // Out of memory
    }

    void *ptr = next_free;
    next_free += size;
    return ptr;
}

void kfree(void *ptr) {
    // Simple allocator doesn't free
    // Real implementation would use free list
    (void)ptr;
}
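
The bump allocator above never reuses memory. One common alternative for page-sized allocations is a bitmap with one bit per frame; this is a different technique from the free list mentioned in kfree and is shown only as a minimal sketch. The names, the 256-frame region size, and the use of <stdint.h> are assumptions made so the code can be compiled on a host. The region starts at 0x100000 to match MEMORY_START and stays inside the first 2 MB.

```c
#include <stdint.h>

#define FRAME_SIZE   4096u
#define FRAME_COUNT  256u        /* hypothetical: 1 MB of managed memory */
#define REGION_BASE  0x100000u   /* same value as MEMORY_START */

/* One bit per frame: 1 = in use, 0 = free. */
static uint8_t frame_bitmap[FRAME_COUNT / 8];

/* Return the address of a free frame, or 0 if none is left. */
uintptr_t frame_alloc(void)
{
    for (uint32_t i = 0; i < FRAME_COUNT; i++) {
        uint8_t mask = (uint8_t)(1u << (i % 8));
        if (!(frame_bitmap[i / 8] & mask)) {
            frame_bitmap[i / 8] |= mask;
            return REGION_BASE + (uintptr_t)i * FRAME_SIZE;
        }
    }
    return 0;
}

/* Mark a frame as free again; ignores addresses outside the region. */
void frame_free(uintptr_t addr)
{
    if (addr < REGION_BASE || addr % FRAME_SIZE != 0) {
        return;
    }
    uintptr_t i = (addr - REGION_BASE) / FRAME_SIZE;
    if (i >= FRAME_COUNT) {
        return;
    }
    frame_bitmap[i / 8] &= (uint8_t)~(1u << (i % 8));
}
```

The linear scan is slow for large regions but easy to verify, which matters more at this stage. A freed frame is handed out again by the next frame_alloc call because the scan always starts from the lowest index.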


Testing the Kernel


# Build
make clean
make

# Run
make run

# Debug
make debug

Expected output:

x64 Kernel loaded!
Hello from kernel space!

Adding More Features


Timer Interrupt


kernel/timer.c:

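#include "types.h"

extern void puts(const char *);

// kernel.c defines outb as static inline, so it is not visible from other
// files. It is repeated here; a shared header would avoid the duplication.
static inline void outb(uint16_t port, uint8_t val) {
    asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
}

static uint64_t tick = 0;

// Called from the IRQ 0 interrupt stub
void timer_handler(void) {
    tick++;

    if (tick % 100 == 0) {
        puts("Timer tick\n");
    }

    // Send EOI to PIC
    outb(0x20, 0x20);
}

void init_timer(void) {
    // Set PIT frequency (100 Hz)
    uint32_t divisor = 1193180 / 100;

    outb(0x43, 0x36);  // Command: channel 0, low byte then high byte, mode 3
    outb(0x40, (uint8_t)(divisor & 0xFF));
    outb(0x40, (uint8_t)(divisor >> 8));
}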
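
The divisor written to the PIT above is the input clock divided by the desired tick rate, sent as two bytes, low byte first. The following is a small sketch of that arithmetic with range checking. The names are hypothetical, and the clock constant 1193182 Hz is the commonly quoted PIT input frequency; the chapter's listing rounds it to 1193180, which gives the same divisor at 100 Hz.

```c
#include <stdint.h>

#define PIT_BASE_HZ 1193182u     /* PIT input clock, roughly 1.193182 MHz */

struct pit_divisor {
    uint8_t low;    /* written to port 0x40 first  */
    uint8_t high;   /* written to port 0x40 second */
};

/* Compute the 16-bit divisor for a tick rate.
 * Returns 0 on success, -1 if the rate cannot be represented. */
static int pit_divisor_for(uint32_t hz, struct pit_divisor *out)
{
    if (hz == 0) {
        return -1;
    }
    uint32_t d = PIT_BASE_HZ / hz;
    if (d == 0 || d > 0xFFFF) {
        return -1;               /* rate too high or too low for 16 bits */
    }
    out->low  = (uint8_t)(d & 0xFF);
    out->high = (uint8_t)(d >> 8);
    return 0;
}
```

For 100 Hz the divisor is 11931 (0x2E9B), so the low byte is 0x9B and the high byte is 0x2E. Rates below about 19 Hz are rejected by this sketch because the divisor no longer fits in 16 bits.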


Keyboard Driver


kernel/keyboard.c:

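#include "types.h"

extern void putchar(char);

// kernel.c defines the port helpers as static inline, so they are repeated
// here; a shared header would avoid the duplication.
static inline void outb(uint16_t port, uint8_t val) {
    asm volatile("outb %0, %1" : : "a"(val), "Nd"(port));
}

static inline uint8_t inb(uint16_t port) {
    uint8_t ret;
    asm volatile("inb %1, %0" : "=a"(ret) : "Nd"(port));
    return ret;
}

// Scancode set 1 to ASCII, unshifted; 0 marks keys with no printable character
static const char scancode_to_char[] = {
    0, 0, '1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '-', '=', '\b',
    '\t', 'q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p', '[', ']', '\n',
    0, 'a', 's', 'd', 'f', 'g', 'h', 'j', 'k', 'l', ';', '\'', '`',
    0, '\\', 'z', 'x', 'c', 'v', 'b', 'n', 'm', ',', '.', '/', 0, '*', 0, ' '
};

// Called from the IRQ 1 interrupt stub
void keyboard_handler(void) {
    uint8_t scancode = inb(0x60);

    // Key releases have bit 7 set and fall outside the table, so they are ignored
    if (scancode < sizeof(scancode_to_char)) {
        char c = scancode_to_char[scancode];
        if (c) {
            putchar(c);
        }
    }

    // Send EOI
    outb(0x20, 0x20);
}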
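
The handler above prints only unshifted characters. The following is a minimal sketch of how Shift could be tracked, assuming scancode set 1, where bit 7 marks a key release and 0x2A and 0x36 are the left and right Shift keys. The state struct and function name are hypothetical, and only letters are uppercased; shifted digits and punctuation would need a second table.

```c
#include <stdbool.h>
#include <stdint.h>

#define SC_RELEASE 0x80   /* bit 7 set: key released */
#define SC_LSHIFT  0x2A
#define SC_RSHIFT  0x36

/* Same layout as the chapter's scancode_to_char table. */
static const char keymap[] = {
    0, 0, '1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '-', '=', '\b',
    '\t', 'q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p', '[', ']', '\n',
    0, 'a', 's', 'd', 'f', 'g', 'h', 'j', 'k', 'l', ';', '\'', '`',
    0, '\\', 'z', 'x', 'c', 'v', 'b', 'n', 'm', ',', '.', '/', 0, '*', 0, ' '
};

/* Hypothetical driver state: whether a Shift key is currently held. */
struct kbd_state {
    bool shift;
};

/* Return the character for a key press, or 0 if nothing should be printed. */
static char kbd_translate(struct kbd_state *st, uint8_t scancode)
{
    bool released = (scancode & SC_RELEASE) != 0;
    uint8_t key = scancode & 0x7F;

    if (key == SC_LSHIFT || key == SC_RSHIFT) {
        st->shift = !released;   /* press sets, release clears */
        return 0;
    }
    if (released || key >= sizeof(keymap)) {
        return 0;
    }

    char c = keymap[key];
    if (st->shift && c >= 'a' && c <= 'z') {
        c = (char)(c - 'a' + 'A');
    }
    return c;
}
```

In the kernel, keyboard_handler would keep one struct kbd_state in a static variable and pass each byte read from port 0x60 through kbd_translate before calling putchar.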


Key Concepts


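  • Long mode is the 64-bit operating mode of x86/x64 processors
  • Long mode requires paging; this chapter uses four-level page tables
  • The PML4 is the top-level page table in x64, and CR3 holds its physical address
  • A huge page maps 2 MB with a single page-directory entry
  • The IDT maps interrupt and exception vectors to their handlers
  • The GDT needs a 64-bit code segment (L bit set) for the jump into long mode
  • The System V AMD64 ABI is the calling convention GCC uses for x64
  • Serial output is useful for debugging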
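
The paging points above can be made concrete by splitting a virtual address into its table indices. With four-level paging each level is indexed by 9 bits; with a 2 MB huge page the walk stops at the page directory and the low 21 bits are the offset. The following is a minimal sketch; the struct and function names are assumptions for illustration.

```c
#include <stdint.h>

/* Indices into each paging level, plus the offsets for both page sizes. */
struct va_parts {
    uint16_t pml4;        /* bits 47:39 */
    uint16_t pdpt;        /* bits 38:30 */
    uint16_t pd;          /* bits 29:21 */
    uint16_t pt;          /* bits 20:12, unused for a 2 MB page */
    uint32_t offset_4k;   /* bits 11:0  */
    uint32_t offset_2m;   /* bits 20:0  */
};

/* Decompose a virtual address for four-level paging. */
static struct va_parts va_split(uint64_t va)
{
    struct va_parts p;
    p.pml4      = (uint16_t)((va >> 39) & 0x1FF);
    p.pdpt      = (uint16_t)((va >> 30) & 0x1FF);
    p.pd        = (uint16_t)((va >> 21) & 0x1FF);
    p.pt        = (uint16_t)((va >> 12) & 0x1FF);
    p.offset_4k = (uint32_t)(va & 0xFFF);
    p.offset_2m = (uint32_t)(va & 0x1FFFFF);
    return p;
}
```

The VGA buffer at 0xB8000 splits into index 0 at every level, so it is covered by the bootloader's single identity-mapped entry PD[0]. Address 0x200000 has a page-directory index of 1, which the bootloader leaves empty, so touching it raises a page fault until more entries are added.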

Common Mistakes


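  1. Forgetting to enable PAE - CR4.PAE must be set before paging is enabled for long mode
  2. Not identity mapping the bootloader - Once paging is on, the next instruction fetch faults unless the running code is mapped at its own address
  3. Wrong GDT in long mode - The code segment must be a 64-bit one (L bit set)
  4. Stack misalignment - The System V ABI expects RSP to be 16-byte aligned at every call
  5. Not clearing BSS - A flat binary does not carry .bss, so uninitialized data may not be zero
  6. Forgetting the red zone - Use -mno-red-zone for kernel code so interrupts cannot overwrite data below RSP
  7. Missing EFER.LME - Without the long mode enable bit, turning on paging leaves the CPU in 32-bit mode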
Debugging Tips


  • Use serial port - Printf to serial for debugging
  • Print register dumps - Show state before crash
  • Test in stages - Verify each mode transition
  • Check page tables - Print PML4/PDPT/PD entries
  • Use GDB symbols - Compile with debug info
  • QEMU logging - Enable interrupt logging with -d int
  • Bochs - Its built-in debugger can be more convenient than QEMU for early boot code
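
For the serial port tip above, the first step is programming the UART. The following is a minimal sketch that builds the usual initialization sequence for a 16550-compatible port as a table of port writes, so it can be checked on a host before the kernel replays it with outb. The names are hypothetical, and COM1 at base port 0x3F8 is an assumption that matches what QEMU exposes with -serial stdio.

```c
#include <stdint.h>

#define COM1_BASE          0x3F8
#define SERIAL_INIT_WRITES 7

/* One port write: the kernel would replay these with outb(port, value). */
struct port_write {
    uint16_t port;
    uint8_t  value;
};

/* Fill seq with the init sequence for the given baud rate (8 data bits,
 * no parity, 1 stop bit). Returns the number of writes, or -1 if the
 * rate does not divide the 115200 base rate into a 16-bit divisor. */
static int serial_init_sequence(uint32_t baud,
                                struct port_write seq[SERIAL_INIT_WRITES])
{
    if (baud == 0 || 115200u % baud != 0 || 115200u / baud > 0xFFFF) {
        return -1;
    }
    uint16_t div = (uint16_t)(115200u / baud);

    seq[0] = (struct port_write){ COM1_BASE + 1, 0x00 };  /* disable interrupts */
    seq[1] = (struct port_write){ COM1_BASE + 3, 0x80 };  /* set DLAB           */
    seq[2] = (struct port_write){ COM1_BASE + 0, (uint8_t)(div & 0xFF) };
    seq[3] = (struct port_write){ COM1_BASE + 1, (uint8_t)(div >> 8) };
    seq[4] = (struct port_write){ COM1_BASE + 3, 0x03 };  /* 8N1, clear DLAB    */
    seq[5] = (struct port_write){ COM1_BASE + 2, 0xC7 };  /* enable FIFO        */
    seq[6] = (struct port_write){ COM1_BASE + 4, 0x0B };  /* DTR, RTS, OUT2     */
    return SERIAL_INIT_WRITES;
}
```

After replaying the sequence, a character can be sent by waiting until bit 5 of the line status register (base + 5) is set and then writing the byte to the base port. Keeping the sequence as data separates the arithmetic, which is easy to test, from the port I/O, which only works in ring 0.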

Mini Exercises


  1. Add serial port driver for debugging
  2. Implement proper printf function
  3. Create physical memory allocator
  4. Set up timer interrupt (IRQ 0)
  5. Add keyboard interrupt handler (IRQ 1)
  6. Implement basic shell
  7. Create process structure
  8. Add context switching
  9. Implement simple scheduler
  10. Add system call interface

Review Questions


  1. What are the steps to enter long mode from real mode?
  2. What is the purpose of the PML4 table?
  3. How do huge pages work in x64?
  4. What's the difference between IDT and GDT?
  5. Why is the red zone disabled in kernel code?

Reference Checklist


By the end of this chapter, you should be able to:

  • Create a two-stage bootloader for x64
  • Set up paging for long mode
  • Enter long mode from protected mode
  • Set up GDT for long mode
  • Initialize IDT for exceptions and interrupts
  • Handle exceptions (divide error, GPF, page fault)
  • Write VGA text mode driver
  • Implement basic memory allocator
  • Add timer and keyboard interrupts
  • Build complete kernel with Makefile

Next Steps


Now that you've built an x86/x64 kernel, the next chapter covers ARM kernel development. You'll learn the differences in boot process, memory management, interrupt handling, and create a kernel for ARM architecture.




Key Takeaway: Building an x86/x64 kernel requires understanding long mode, four-level paging, interrupt handling, and x64 calling conventions. Start minimal and test incrementally for successful kernel development.