📝 Note: This post covers things I’m not totally sure about, on a topic I’m not super familiar with. If I mess up, which I most likely will, feel free to reach out and share your insights.
Context
I recently had this idea of writing my own RISC-V binary emulator/translator in Rust. Naturally, I found myself having to sharpen my RISC-V assembly skills, mainly for writing my own programs to test the emulator with.
After doing some research and reading about similar concepts online, I became curious about the huge rabbit hole that modern compilers are, and more specifically about how compilers (like gcc) generate assembly. Feeling inspired and having, admittedly, a bit too much free time on my hands, I decided to find a way to compile Rust to RISC-V assembly for local running and debugging, using QEMU and the GNU Debugger.
Despite having no real experience with Rust or the RISC-V ISA, I decided to go for it, and document my journey along the way. Think of it as learning-by-doing.
Prerequisite
First things first: if you’re interested in trying this project for yourself, you’ll need the riscv-gnu-toolchain. Instructions on how to build the toolchain are in the GitHub repo; just note that in this blog post we’ll be using the Newlib cross-compiler, as we are targeting embedded systems and bare-metal applications that don’t rely on a full operating system like Linux.
You’ll also need to have qemu installed, either by building it from source or by installing a prebuilt binary. Either way, you will probably end up with two executables: qemu-system-riscv32 and qemu-system-riscv64.
💡 Tip: If you decide to build it from source, you might find this useful:
# Building QEMU to support both 32-bit and 64-bit RISC-V architectures:
make clean
./configure --target-list=riscv64-softmmu,riscv32-softmmu
make -j $(nproc)
sudo make install
Finally, since we are going to be using Rust and its build system, make sure that you have both of them installed and up to date, preferably using the official toolchain manager, rustup.
Once you have a working Rust development environment, you’ll need to:
- Initialize a new cargo project and navigate there:
cargo new qemu_riscv && cd qemu_riscv
- Create the .cargo directory and navigate there:
mkdir .cargo && cd .cargo
- Create the Cargo config file (config.toml is the name Cargo currently documents; the extension-less config is a legacy spelling):
touch config.toml
- Copy and paste this there:
[build]
target = "riscv64gc-unknown-none-elf"
The above snippet lets Cargo know that we are targeting the RV64GC architecture, as per QEMU’s virt machine docs.
The first build needs the precompiled core library for that target. If it is missing, the build fails with an error saying that the target may not be installed; in that case, add it with rustup target add riscv64gc-unknown-none-elf.
You can also check whether your compiler knows about the RISC-V targets by running the command below (rustup target list --installed shows which targets are actually installed):
$ rustc --print target-list | grep riscv
The Fun part
Now that we’ve successfully installed and set up everything we need, let’s transition to the fun part: writing some code.
Printing stuff is hard.
We’ll be writing a simple “Hello, world” app in Rust and then cross-compiling it to RV64GC, with the riscv-gnu-toolchain supplying the linker script and the debugger. Usually, this type of project means printing a simple string (say “Hello, world!”) to stdout.
In this case, however, things are a bit different. Doing a print('hello, world!') simply won’t cut it. Bare-metal development demands a direct, low-level approach and leaves little room for the conveniences of higher-level abstractions.
Thus, in bare-metal programming, the typical “Hello, world” project revolves around transmitting or receiving messages using a special hardware device (interface) called UART.
Additionally, given that we are working without the luxuries provided by a full-blown operating system, we won’t be able to link against the usual libraries, including Rust’s standard library, during the build process. The standard library relies on services that only an operating system provides, and its large storage footprint makes it impractical when developing for devices with a couple of megabytes of storage.
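Giving up the standard library doesn’t mean giving up everything, though: the core crate is still available in a no_std build. As a rough sketch (not what this post ends up using), here is how formatting via core::fmt::Write could look. ByteSink is a hypothetical stand-in for a real output device, and the main function only exists so the snippet can be tried on a regular host machine.

```rust
use core::fmt::{self, Write};

/// Hypothetical fixed-size sink standing in for a real output device.
struct ByteSink {
    buf: [u8; 64],
    len: usize,
}

impl ByteSink {
    fn new() -> Self {
        ByteSink { buf: [0; 64], len: 0 }
    }

    /// The bytes written so far.
    fn as_bytes(&self) -> &[u8] {
        &self.buf[..self.len]
    }
}

// Implementing `core::fmt::Write` is all that `write!` needs; no `std` involved.
impl Write for ByteSink {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let bytes = s.as_bytes();
        let end = self.len + bytes.len();
        if end > self.buf.len() {
            return Err(fmt::Error); // the sink is full
        }
        self.buf[self.len..end].copy_from_slice(bytes);
        self.len = end;
        Ok(())
    }
}

fn main() {
    let mut sink = ByteSink::new();
    write!(sink, "Hello, {}! {:#x}", "world", 0x1000_0000u32).unwrap();
    assert_eq!(sink.as_bytes(), &b"Hello, world! 0x10000000"[..]);
}
```

On real hardware, write_str would push each byte to the device instead of copying it into a buffer.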
Since we are aiming to emulate our cross-compiled binary using QEMU, this is the point where we should check the documentation. Looking at QEMU’s -help output, we find the following option:
-machine [type=]name[,prop[=value][,...]]
selects emulated machine ('-machine help' for list)
Upon execution, we get:
$ qemu-system-riscv64 -machine help
Supported machines are:
none empty machine
sifive_e RISC-V Board compatible with SiFive E SDK
sifive_u RISC-V Board compatible with SiFive U SDK
spike RISC-V Spike Board (default)
virt RISC-V VirtIO board
We’re picking the virt [1] option, since it’s the simplest machine provided.
Since documentation around this is scarce, here’s what we need to know:
- Virt represents a RISC-V VirtIO board.
- VirtIO is a standard for virtualized I/O devices, and this machine type includes support for virtualized devices using the VirtIO framework.
Device Trees and UART Configuration
Now that we know which machine we’re targeting, we need to find out which devices are available on it, along with their topology and configuration. What we need is our target’s Device Tree.
A device tree is a data structure that modern systems use to describe their hardware in a platform-independent manner.
In a device tree, devices and their properties are described in a hierarchical, tree-like structure. Each node in the tree represents a device, and properties specify details such as the device’s address, interrupt lines, and other configuration parameters.
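To make the “tree of nodes with properties” idea a bit more concrete, here is a minimal sketch of how such a structure could be modelled. The Node type, the find helper and the intermediate soc node are all hypothetical and only for illustration; the UART node name and its compatible value are borrowed from the device tree we dump further down.

```rust
/// Hypothetical in-memory model of a device-tree node.
struct Node {
    name: &'static str,
    properties: Vec<(&'static str, &'static str)>,
    children: Vec<Node>,
}

/// Depth-first search for the first node with the given name.
fn find<'a>(node: &'a Node, name: &str) -> Option<&'a Node> {
    if node.name == name {
        return Some(node);
    }
    node.children.iter().find_map(|child| find(child, name))
}

/// A tiny hand-written tree: root -> soc (assumed) -> uart.
fn sample_tree() -> Node {
    Node {
        name: "/",
        properties: vec![],
        children: vec![Node {
            name: "soc",
            properties: vec![],
            children: vec![Node {
                name: "uart@10000000",
                properties: vec![("compatible", "ns16550a")],
                children: vec![],
            }],
        }],
    }
}

fn main() {
    let tree = sample_tree();
    let uart = find(&tree, "uart@10000000").expect("node not found");
    for (key, value) in &uart.properties {
        println!("{} = {}", key, value);
    }
}
```

A real device tree stores property values as raw bytes rather than strings, so treat this purely as a mental model.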
Fortunately for us, getting said device tree for our target system is pretty easy. Running:
$ qemu-system-riscv64 -machine virt,dumpdtb=qemu-riscv64-virt.dtb
will generate a device tree binary (DTB) file named qemu-riscv64-virt.dtb and dump it to disk.
The file we generated is in binary form, so we can’t view its contents just yet. Instead, we need to convert the binary device tree into a human-readable Device Tree Source (DTS) file. We can do that using the Device Tree Compiler (dtc):
$ dtc qemu-riscv64-virt.dtb > qemu-riscv64-virt.dts
Now that we’ve successfully generated our target’s Device Tree, we can inspect every hardware device available to us.
Out of all the devices in there, we are mostly interested in the UART section:
uart@10000000 {
interrupts = <0x0a>;
interrupt-parent = <0x03>;
clock-frequency = "\08@";
reg = <0x00 0x10000000 0x00 0x100>;
compatible = "ns16550a";
};
There’s a lot of interesting information in here. To begin with, the uart@10000000 designation tells us that the UART interface is accessible at memory address 0x10000000. Moving forward, the reg property tells us where the UART’s memory region starts, as well as how far it extends. As it’s brilliantly explained here [2], the four cells are read in pairs (assuming the usual two address cells and two size cells): the first pair, 0x00 0x10000000, is the base address and the second pair, 0x00 0x100, is the size, with the first cell of each pair holding the upper 32 bits. The region is therefore mapped into memory from 0x1000_0000 to 0x1000_0100, a span of 0x100 (256) bytes.
Finally, the other part of the device tree we are interested in is:
memory@80000000 {
device_type = "memory";
reg = <0x00 0x80000000 0x00 0x8000000>;
};
Read the same way as before, this means that the memory ranges from 0x8000_0000 to 0x8800_0000, i.e. 0x8000000 bytes (128 MiB) of RAM.
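If the cell arithmetic feels a bit hand-wavy, here is a small sketch that decodes both reg properties. It assumes that each value is made of two 32-bit address cells followed by two 32-bit size cells, which is what the four-cell layout above suggests; decode_reg is a hypothetical helper, not part of any library.

```rust
/// Decode a `reg` property of the form <addr_hi addr_lo size_hi size_lo>.
/// Assumption: two address cells and two size cells, 32 bits each.
fn decode_reg(cells: [u32; 4]) -> (u64, u64) {
    let base = ((cells[0] as u64) << 32) | cells[1] as u64;
    let size = ((cells[2] as u64) << 32) | cells[3] as u64;
    (base, size)
}

fn main() {
    // reg = <0x00 0x10000000 0x00 0x100>; from the UART node
    let (uart_base, uart_size) = decode_reg([0x00, 0x1000_0000, 0x00, 0x100]);
    println!("UART: {:#x}..{:#x}", uart_base, uart_base + uart_size);

    // reg = <0x00 0x80000000 0x00 0x8000000>; from the memory node
    let (ram_base, ram_size) = decode_reg([0x00, 0x8000_0000, 0x00, 0x800_0000]);
    println!(
        "RAM:  {:#x}..{:#x} ({} MiB)",
        ram_base,
        ram_base + ram_size,
        ram_size >> 20
    );
}
```

The end addresses it prints are exclusive, so the last usable byte sits one below each of them.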
Linker Park
Now that we finally know which machine we are targeting, and the memory address of its UART interface, we can begin figuring out how exactly we are going to execute our bare-metal application. Simply writing a main function and compiling our application won’t work, since there is no OS to take care of system initialization.
Fortunately for us, we can start from the internal linker script of GNU ld, which supports a number of architectures, by doing:
$ riscv64-unknown-elf-ld --verbose > linker.ld
After removing the redundant sections at the top and bottom, we can add our findings from the previous section like so:
SECTIONS
{
PROVIDE(__uart_base_addr = 0x10000000);
}
Finally, and most importantly, we’ll have to add a MEMORY command at the start of the linker script, defining the memory regions available to our program. In this case, it specifies a region named RAM starting at address 0x80000000, with a length of 128M (128 MiB, matching the memory node of the device tree) and with read, write, and execute permissions (rwx).
MEMORY
{
RAM (rwx) : ORIGIN = 0x80000000, LENGTH = 128M
}
Writing our program
It’s finally time to write some code. Since we previously used cargo new to generate our project, a src/main.rs file got generated automatically. So let’s edit it:
// The Rust compiler should not include the standard library
// in our compiled binary
#![no_std]
// The Rust compiler should not generate the standard main function
// as the entry point for the executable
#![no_main]

use core::arch::asm;
use core::panic::PanicInfo;

// Defining a custom panic handler, i.e. a function that gets called when a panic occurs
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    // Infinite loop
    loop {}
}

// Explicitly placing our `_start` function in the `.text._start`
// section of our linker.ld
#[link_section = ".text._start"]
// The Rust compiler should use the exact, unmangled name as
// the symbol in the compiled output
#[no_mangle]
pub extern "C" fn _start() -> ! {
    unsafe {
        asm!(
            "
            li s1, 0x10000000 // s1 = 0x1000_0000, i.e. our UART base address
            la s2, message    // s2 = address of the string
            addi s3, s2, 14   // s3 = s2 + 14, i.e. the address just past the end of the string
            1:                // Rust only allows numeric labels of the form `<number>:`
                              // in inline assembly.
            lb s4, 0(s2)      // s4 = (s2), i.e. load the byte at the memory
                              // address stored in s2 into s4
            sb s4, 0(s1)      // (s1) = s4, i.e. store the byte from s4 to the
                              // memory address stored in s1
            addi s2, s2, 1    // s2 = s2 + 1
            blt s2, s3, 1b    // if s2 < s3, branch back to 1
            ",
            // Don't push data to the stack.
            // You can read more about what this option does here:
            // https://doc.rust-lang.org/reference/inline-assembly.html?search=#options
            options(nostack)
        );
    }
    loop {}
}

// Kept lowercase and unmangled so that `la s2, message` above can find the symbol
#[no_mangle]
static message: [u8; 14] = *b"Hello, world!\n";
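For comparison, the same byte-by-byte loop can be sketched in plain Rust using volatile accesses instead of inline assembly. This is not the code used in the rest of this post, just an illustration. It additionally polls the 16550 Line Status Register (offset 5, bit 5) before each write, which the assembly version above skips. The fake_uart array is a hypothetical stand-in so the snippet runs on a regular host; on the virt machine one would presumably pass 0x1000_0000 as *mut u8 instead.

```rust
use core::ptr::{read_volatile, write_volatile};

/// Offset of the 16550 Line Status Register (LSR) from the UART base.
const LSR_OFFSET: usize = 5;
/// LSR bit 5: Transmit Holding Register Empty (ready for the next byte).
const LSR_THRE: u8 = 1 << 5;

/// Write `bytes` one at a time to a 16550-style UART located at `base`.
///
/// # Safety
/// `base` must point to at least 6 bytes that are valid for volatile
/// reads and writes (the UART's register block, or a stand-in for it).
unsafe fn uart_write(base: *mut u8, bytes: &[u8]) {
    for &byte in bytes {
        unsafe {
            // Spin until the transmitter reports that it can take a byte.
            while read_volatile(base.add(LSR_OFFSET)) & LSR_THRE == 0 {}
            // Offset 0 is the transmit register.
            write_volatile(base, byte);
        }
    }
}

fn main() {
    // Stand-in for the memory-mapped registers so the sketch runs on a host.
    let mut fake_uart = [0u8; 8];
    fake_uart[LSR_OFFSET] = LSR_THRE; // pretend the transmitter is always ready
    unsafe { uart_write(fake_uart.as_mut_ptr(), b"Hello, world!\n") };
    // The transmit register now holds the last byte that was written.
    assert_eq!(fake_uart[0], b'\n');
}
```

The volatile accesses matter here: they tell the compiler that every read and write has a side effect and must not be merged or optimized away.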
Let’s write a simple Makefile for compiling our code:
TARGET := target
LINKER_SCRIPT := ./linker.ld

.PHONY: build clean

build:
	@echo "Compiling..."
	cargo rustc -- -g -C link-arg=--script=$(LINKER_SCRIPT)

clean:
	@echo "Cleaning..."
	rm -rf $(TARGET)
Running make build now should produce an ELF like so:
$ file qemu_riscv
qemu_riscv: ELF 64-bit LSB executable, UCB RISC-V, RVC, double-float ABI, version 1 (SYSV),
statically linked, with debug_info, not stripped
Notice the with debug_info, a result of us using the -g flag (for rustc, a shorthand for -C debuginfo=2). It plays the same role as gcc’s flag of the same name, which the gcc man page describes as follows:
-g: produce debugging information in the operating system’s native format (stabs, COFF, XCOFF, or DWARF). GDB can work with this debugging information.
Running and debugging our code
Now that we have our RISC-V binary, it’s time to set up the QEMU side of things.
$ qemu-system-riscv64 -machine virt
Next, we’ll need to tell QEMU to accept a gdb connection using the -gdb dev flag. According to the QEMU man page:
-gdb dev: accept a gdb connection on device dev (see the GDB usage chapter in the System Emulation Users Guide). Note that this option does not pause QEMU execution - if you want QEMU to not start the guest until you connect with gdb and issue a continue command, you will need to also pass the -S option to QEMU.
-s: shorthand for -gdb tcp::1234, i.e. open a gdbserver on TCP port 1234
Which leaves us with the following command:
$ qemu-system-riscv64 -machine virt -bios none -kernel target/riscv64gc-unknown-none-elf/debug/rust-riscv -s -S
📝 Note: Notice the -S flag, which keeps QEMU from starting the guest CPU, so that our virtual machine waits for a debugger to attach and issue a continue command.
Now that we’ve set up QEMU to accept a gdb connection via TCP, we need to invoke GDB and load our Rust program for debugging.
$ riscv64-unknown-elf-gdb target/riscv64gc-unknown-none-elf/debug/rust-riscv
Let’s edit our Makefile, to include these commands:
TARGET := target
# Must match the package name in Cargo.toml
CARGO_NAME := rust-riscv
TARGET_DIR := $(TARGET)/riscv64gc-unknown-none-elf/debug/$(CARGO_NAME)
LINKER_SCRIPT := ./linker.ld

.PHONY: build clean qemu qemu-debug gdb

# Emulate our binary, without any debugger capabilities enabled.
qemu: $(TARGET_DIR)
	@echo "Running..."
	qemu-system-riscv64 -machine virt -bios none -kernel $< -nographic

# Emulate our binary, with debugger capabilities enabled.
qemu-debug: $(TARGET_DIR)
	@echo "Running and waiting for GDB"
	qemu-system-riscv64 -machine virt -bios none -kernel $< -nographic -s -S

build:
	@echo "Compiling..."
	cargo rustc -- -g -C link-arg=--script=$(LINKER_SCRIPT)

gdb: $(TARGET_DIR)
	@echo "Opening GNU Debugger for remote debugging..."
	riscv64-unknown-elf-gdb $<

clean:
	@echo "Cleaning..."
	rm -rf $(TARGET)
Finally, running:
$ make clean
Cleaning...
rm -rf target
$ make build
Compiling...
cargo rustc -- -g -C link-arg=--script=./linker.ld
Compiling rust-riscv v0.1.0 (/home/petrside/github/riscv-qemu/testing)
Finished dev [unoptimized + debuginfo] target(s) in 0.11s
$ make qemu
Running...
qemu-system-riscv64 -machine virt -bios none -kernel target/riscv64gc-unknown-none-elf/debug/rust-riscv -nographic
Hello, world!
📝 Note: Ctrl+A followed by X terminates the emulation.
Alternatively, to examine and debug our application while it runs, we can:
# Emulate our binary, with debugger capabilities enabled.
$ make qemu-debug
Running and waiting for GDB
qemu-system-riscv64 -machine virt -bios none -kernel target/riscv64gc-unknown-none-elf/debug/rust-riscv -nographic -s -S
Next, in a separate shell (but in the same directory), let’s open the GNU Debugger for remote debugging, using:
$ make gdb
Opening GNU Debugger for remote debugging...
riscv64-unknown-elf-gdb target/riscv64gc-unknown-none-elf/debug/rust-riscv
...
Reading symbols from target/riscv64gc-unknown-none-elf/debug/rust-riscv...
(gdb) target remote :1234
Remote debugging using :1234
(gdb) break _start
Breakpoint 1 at 0x80000020: file src/main.rs, line 35.
(gdb) continue
Continuing.
Returning to the original console we get:
$ make qemu-debug
Running and waiting for GDB
qemu-system-riscv64 -machine virt -bios none -kernel target/riscv64gc-unknown-none-elf/debug/rust-riscv -nographic -s -S
Hello, world!
You should see the ‘Hello, world!’ message printed to the console.