
Friday, April 10, 2026

QRV v0.21: Interactive Shell on Real Hardware

It boots. On a real RISC-V board, pressing a real reset button, with a real serial cable connected to a real terminal. The shell prompt appears. pidin runs and shows 15 GB free out of 16. shutdown works. This is v0.21 on a SiFive HiFive Unmatched.

The full console capture:

U-Boot 2025.10-0ubuntu0.24.04.1 (Nov 19 2025 - 17:26:40 +0000)

CPU:   sifive,bullet0
Model: SiFive HiFive Unmatched A00
DRAM:  16 GiB
...
=> run bootqrv
385008 bytes read in 0 ms
3778560 bytes read in 4 ms (900.9 MiB/s)

Starting kernel ...

===============================================================================

Boot command line: -Dsbi
+------------------------------------------+
| QRV Operating System Kernel version 0.21 |
+------------------------------------------+
init_clocks
init_raminfo
init_mmu
  ram_top = 480000000
  identity_map RAM 80000000..480000000
  identity_map CLINT 2000000
  identity_map PLIC c000000
  high_va_map RAM
  high_va_map CLINT
  high_va_map PLIC
  page tables ready, root at 80a62000
  enable_mmu...
enable_mmu: Sv39 paging active
  MMU enabled
init_intrinfo: timer(base=5,n=1) plic(base=32,n=128)
init_qtime: cycles_per_sec=1000000 timer_rate=1000 intr=5
init_cpuinfo: 4 CPUs, rv64imac, 1 MHz
init_system_private
Startup complete, entering kernel...
[2] modpkg: bin/esh: executable (recorded)
[2] modpkg: boot/taskman.qkx: kernel extension (recorded)
[2] modpkg: lib/ld-qrv.so: shared library (recorded)
[2] modpkg: lib/libc.qrl: shared library (recorded)
[2] modpkg: loaded bin/esh (sv39) at VA 10000, entry=120a4
[2] modpkg: ld-qrv.so: applied 160/160 RELATIVE relocations
[2] modpkg: loaded ld-qrv.so at VA 100000, entry=101610
[2] modpkg: bin/esh is dynamic, entering via ld-qrv.so at 101610, sp=3fffffdf60
[2] modpkg: 5 module(s) found
[2] plic_init: 127 sources configured

*** QRV kernel: preparing to start taskman ***

[2] kerlink: lib/libc.qrl: ET_DYN, 8 program headers
[2] kerlink: lib/libc.qrl: exported 703 symbols for dynamic linking
[2] kerlink: boot/taskman.qkx: ET_REL, 22 sections
[2] kerlink: boot/taskman.qkx: resolved 5716/5716 external symbols (0 unresolved)
[2] kerlink: boot/taskman.qkx: applied 9954/9954 relocations (0 skipped)
[2] kerlink: boot/taskman.qkx: link complete, entry taskman_main=ffffffc0803c44c8
[2] taskman: message_init...
[2] taskman: rsrcdbmgr_init...
[2] taskman: sysmgr_init...
[2] taskman: pathmgr_init...
[2] taskman: bootimage_init...
[2] taskman: namedsem_init...
[2] taskman: message_start...
[2] cpu_start_others: hart 1 started
[2] cpu_start_others: hart 3 started
[2] cpu_start_others: hart 4 started
[2] bootimage_start: opening /dev/console
[2] bootimage: spawned /rd/init (pid 2)

***********************************
* Welcome to QRV Operating System *
***********************************

     pid tid name               prio STATE       Blocked
       1   1 taskman            255r RUNNING
       1   2 taskman            255r RUNNING
       1   3 taskman            255r RUNNING
       1   4 taskman            255r RUNNING
       1   5 taskman            255r RECEIVE     1
       1   6 taskman             10r RUNNING
       1   7 taskman            255r RECEIVE     1
       1   8 taskman            255r RECEIVE     1
       1   9 taskman             10r RECEIVE     1
       1  10 taskman             10r RECEIVE     1
       1  11 taskman             10r RECEIVE     1
       1  12 taskman             10r RECEIVE     1
       1  13 taskman             10r RECEIVE     1
       2   1 esh                 10r REPLY       1
       2   2 esh                 10r RECEIVE     1
       3   1 pidin               10r REPLY       1
       3   2 pidin               10r RECEIVE     1

Starting devc-ser8250...
[2] devc-ser8250: no UART found in hwinfo and no -p/-i given
system console set to /dev/ser1
Unable to reopen /dev/ser1

Starting devb-virtio...
devb-virtio: no virtio device in hwinfo and no -p/-i given

Starting fs-qrv...
fs-qrv: cannot open /dev/vblk0: No such file or directory

Use shutdown command to shut down the machine

# pidin info
CPU:RISC-V Release:0.21   FreeMem:15Gb/16Gb BootTime:0
Processes: 4, Threads: 18
Processor1: 1 rv64imac 1MHz FPU
Processor2: 2 rv64imac 1MHz FPU
Processor3: 3 rv64imac 1MHz FPU
Processor4: 4 rv64imac 1MHz FPU
# shutdown
Shutting down...

What the Log Shows

The Unmatched has 16 GiB of RAM across physical addresses 0x80000000–0x47fffffff. All of it is discovered, mapped, and visible to the allocator. pidin info reports FreeMem:15Gb/16Gb — the kernel and taskman together occupy about 1 GB of a 16 GB machine, which is reasonable for a debug build with no memory trimming.

Four CPUs come up: the boot hart plus harts 1, 3, and 4, which cpu_start_others brings online. These are the U74 application cores; hart 0, the S7 monitor core without MMU support, is correctly skipped. Thirteen taskman threads run across them. The init script runs and spawns pidin, the process table is displayed, the shell accepts input, and shutdown works.

The virtio and UART driver failures are expected: devb-virtio and devc-ser8250 currently discover their hardware through the syspage hwinfo section, which is populated from FDT nodes for virtio,mmio and ns16550a. The Unmatched's UART uses a different compatible string (sifive,uart0), and there is no virtio device on real hardware. Both drivers fail gracefully with a diagnostic. The console falls back to the in-kernel devtext driver, which is why the shell prompt is reachable at all.

These are the next targets.
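
The discovery failure comes down to a string comparison against the FDT compatible property. A minimal sketch, assuming the hwinfo population step keeps a table of the compatible strings it recognizes. The table and the function name hwinfo_recognizes are hypothetical illustrations, not QRV's actual startup code:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table: the two compatible strings the post says are
 * currently turned into hwinfo entries. */
static const char *const hwinfo_known_compat[] = {
    "virtio,mmio",
    "ns16550a",
};

/* Returns 1 if an FDT node with this compatible string would get an
 * hwinfo entry in this sketch, 0 otherwise. */
static int hwinfo_recognizes(const char *compatible)
{
    size_t n = sizeof hwinfo_known_compat / sizeof hwinfo_known_compat[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(compatible, hwinfo_known_compat[i]) == 0)
            return 1;
    }
    return 0;
}
```

In this model, hwinfo_recognizes("sifive,uart0") returns 0, so a driver that only looks in hwinfo finds nothing. Supporting the Unmatched's UART would mean a new table entry and a driver for that hardware, since the SiFive UART is not an 8250.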


What Fixed It

Three bugs stood between the first crash and a clean boot. Two were hardware-specific — invisible on QEMU, fatal on real silicon.

CPU_SYSTEM_PADDR_END = 0xFFFFFFFF. The physical allocator's ceiling was 4 GB, so 14 of the Unmatched's 16 GiB were simply outside the allocatable universe. Any address in that region passed through the allocator with undefined results: function pointers truncated to 32 bits, corrupted call targets, instruction page faults. The fix is one line. The limit it replaces was committed on March 14th and lay dormant until today:

- #define CPU_SYSTEM_PADDR_END   0xFFFFFFFFUL
+ #define CPU_SYSTEM_PADDR_END   ((1UL << 56) - 1)

Along with it: the paddr32/paddr64 split mechanism — an x86 PAE relic that made no sense on LP64 RISC-V — was removed end-to-end. On a 64-bit architecture where every pointer is 64 bits, the per-process opt-in to high physical memory is pointless. restrict_user now covers the full 0 … last_paddr range.
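
The "14 of 16 GiB" figure can be checked from the boot log. A small sketch of the arithmetic, using the RAM range printed by init_mmu and the two ceilings from the diff. The helper name bytes_above_ceiling is made up for illustration:

```c
#include <stdint.h>

#define GIB (1ULL << 30)

/* Values taken from the boot log and the diff above. */
#define RAM_BASE      0x80000000ULL        /* identity_map RAM start */
#define RAM_TOP       0x480000000ULL       /* ram_top printed by init_mmu */
#define OLD_PADDR_END 0xFFFFFFFFULL        /* old allocator ceiling (inclusive) */
#define NEW_PADDR_END ((1ULL << 56) - 1)   /* new ceiling (inclusive) */

/* Bytes of RAM in [base, top) that lie above the inclusive ceiling `end`.
 * Assumes end < UINT64_MAX so that end + 1 does not wrap. */
static uint64_t bytes_above_ceiling(uint64_t base, uint64_t top, uint64_t end)
{
    uint64_t first_bad = end + 1;   /* first address past the ceiling */
    if (top <= first_bad)
        return 0;                   /* all RAM is below the ceiling */
    if (base >= first_bad)
        return top - base;          /* all RAM is above the ceiling */
    return top - first_bad;         /* RAM straddles the ceiling */
}
```

With the old ceiling, bytes_above_ceiling(RAM_BASE, RAM_TOP, OLD_PADDR_END) is 14 * GIB: RAM starts at 2 GiB, so only the first 2 GiB of it sit below the 4 GiB line. With the new ceiling the result is 0.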

PTE Accessed/Dirty bits on user mappings. The U74 uses trap-based A/D bit management: if a leaf PTE doesn't have PTE_A set on first access (or PTE_D set on first write), the CPU raises a page fault and expects the OS to set the bit. QEMU emulates hardware-managed A/D updates and sets the bits automatically, masking the bug entirely. prot_to_pte_bits() now pre-sets PTE_A on all mappings and PTE_D on writable ones.
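
A minimal sketch of what such a translation could look like for a user mapping. The PTE bit positions are the ones defined by the RISC-V privileged specification. The SK_PROT_* flags and the function name prot_to_pte_bits_sketch are assumptions for illustration, and the real prot_to_pte_bits() in QRV may be structured differently:

```c
#include <stdint.h>

/* Sv39 leaf PTE flag bits, per the RISC-V privileged specification. */
#define PTE_V (1ULL << 0)   /* valid */
#define PTE_R (1ULL << 1)   /* readable */
#define PTE_W (1ULL << 2)   /* writable */
#define PTE_X (1ULL << 3)   /* executable */
#define PTE_U (1ULL << 4)   /* user-accessible */
#define PTE_A (1ULL << 6)   /* accessed */
#define PTE_D (1ULL << 7)   /* dirty */

/* Hypothetical protection flags; QRV's real constants may differ. */
#define SK_PROT_READ  0x1u
#define SK_PROT_WRITE 0x2u
#define SK_PROT_EXEC  0x4u

/* Build leaf PTE flags for a user page. PTE_A is always pre-set and
 * PTE_D is pre-set on writable pages, so a core with trap-based A/D
 * management never faults just to have these bits set. */
static uint64_t prot_to_pte_bits_sketch(unsigned prot)
{
    uint64_t bits = PTE_V | PTE_U | PTE_A;

    if (prot & SK_PROT_READ)
        bits |= PTE_R;
    if (prot & SK_PROT_WRITE)
        bits |= PTE_R | PTE_W | PTE_D;  /* W without R is a reserved encoding */
    if (prot & SK_PROT_EXEC)
        bits |= PTE_X;
    return bits;
}
```

Pre-setting the bits gives up the information they would otherwise carry (which pages were actually touched or dirtied). That is an acceptable trade as long as nothing in the system consumes it.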

cpiofs_lseek return value in combined messages. When the ELF loader sends a combined lseek+read message (rcvid == 0), cpiofs_lseek was returning _RESMGR_PTR instead of EOK. _RESMGR_PTR breaks the combined-message dispatch loop in _resmgr_io_handler — the read handler never runs, proc_read gets a short reply, and retries with a shifted buffer pointer. On QEMU this produced intermittent load failures; on the Unmatched it was consistent enough to block the boot entirely.
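
The failure mode can be shown with a toy model of a combined-message dispatch loop. This is not the real _resmgr_io_handler: every name below is hypothetical, and SK_REPLY_READY merely stands in for a _RESMGR_PTR-style status that tells the dispatcher to reply immediately. The only behavior modeled is the one described above, namely that any status other than EOK ends the loop.

```c
#include <stddef.h>

#define SK_EOK         0     /* part handled, continue with the next part */
#define SK_REPLY_READY (-1)  /* stand-in for a "reply now" status */

typedef int (*sk_handler_t)(void *ctx);

/* Run the handlers of a combined message in order and return how many ran.
 * Any status other than SK_EOK ends the loop early. */
static size_t sk_dispatch_combined(const sk_handler_t *parts, size_t nparts,
                                   void *ctx)
{
    size_t ran = 0;
    for (size_t i = 0; i < nparts; i++) {
        ran++;
        if (parts[i](ctx) != SK_EOK)
            break;
    }
    return ran;
}

/* Buggy lseek: asks for an immediate reply, so the read part never runs. */
static int sk_lseek_buggy(void *ctx) { (void)ctx; return SK_REPLY_READY; }

/* Fixed lseek: returns SK_EOK so dispatch continues to the read part. */
static int sk_lseek_fixed(void *ctx) { (void)ctx; return SK_EOK; }

/* Read part: records that it ran by setting the int that ctx points to. */
static int sk_read(void *ctx) { *(int *)ctx = 1; return SK_EOK; }
```

Dispatching { sk_lseek_buggy, sk_read } runs one handler and leaves the read flag untouched; dispatching { sk_lseek_fixed, sk_read } runs both.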


The March 14 Comment

The CPU_SYSTEM_PADDR_END constant was set on March 14th, the same week as the first user-mode ecall dispatch and the same week the runtime linker first self-relocated. The comment at the time read:

QEMU virt tops out at ~4GB for now.

A temporary constraint, noted honestly, and then left in place while the system grew around it. On QEMU with 256 MB of RAM, no physical address ever came close to the 4 GB ceiling. Nearly four weeks later, on a machine with 64 times more RAM than QEMU was given, the constraint became visible.

That is the nature of this kind of porting work. The bugs that matter are not the ones that crash immediately — those are easy to find. The ones that matter are the ones that were correct for the world they were written in.


What Comes Next

UART driver support for the Unmatched's sifive,uart0 hardware, so the shell runs on the real serial driver rather than devtext. After that: block storage on real hardware, which means either NVMe through the PCIe slot the Unmatched provides, or SD card. The PCI server from QNX 6.6 is on the horizon.
