My day job is writing embedded software, so I do a decent amount of Linux work. However, since the team that I work on was created long before I joined, there’s already a set of tools that builds our Linux image. When I want to test a Linux change, I run one build command and out pops a full Linux image. But if, by some tragic accident, all our build code disappeared tomorrow, how would I go about building a Linux image myself? How, exactly, do you go from a Linux source tree and some userspace code to a bootable binary? I wasn’t sure, so I decided to find out!
Before I start, I need to define the final product I’m looking for. I want a Linux kernel that I compiled, running with a device tree I compiled, booting off a file system I made, into a shell. No Wi-Fi, no GUI, just a terminal screen that I can type in. Basically, the minimum viable product of a Linux distro.
Now, if you go look up how to build a Linux image, you’re going to come
across two major tools: Buildroot and Yocto. And while I assume that these
tools are very powerful, I already don’t know how to build a Linux image.
And learning how to build a Linux image at the same time as I learn a tool
that builds the Linux image seems a bit much. So, for this exercise, I’m
going to be doing everything by hand (and by hand I mean using
make plus whatever other miscellaneous utilities I need). But
no pre-packaged Linux building system for me!
[1]
There are really only three components that I need to build to boot into a shell: a Linux kernel, a device tree, and a root file system.
So with all this in mind, onto step 1!
Before I start building anything, I need to decide which hardware I’m going to be building for. I already have a Raspberry Pi 4 at home, so I decided to go with that (this also means that I could boot the image on real hardware in the future if I desired).
The Raspberry Pi Foundation has a page on how to build a Linux kernel, so I started there. [2]
The documentation said to run this to build the 64-bit kernel
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- Image modules dtbs
and to run this to build the 32-bit kernel
make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- zImage modules dtbs
I was a little confused to see two different targets for the kernel; do the 64-bit and 32-bit kernels not use the same target?
After a little googling it turns out that this can be answered via the help target. Running
make ARCH=arm64 help
gives these build targets
Architecture-specific targets (arm64):
* Image.gz - Compressed kernel image (arch/arm64/boot/Image.gz)
Image - Uncompressed kernel image (arch/arm64/boot/Image)
and setting ARCH=arm gives these build targets
Architecture-specific targets (arm):
* zImage - Compressed kernel image (arch/arm/boot/zImage)
Image - Uncompressed kernel image (arch/arm/boot/Image)
* xipImage - XIP kernel image, if configured (arch/arm/boot/xipImage)
uImage - U-Boot wrapped zImage
bootpImage - Combined zImage and initial RAM disk
(supply initrd image via make variable INITRD=<path>)
So it seems that zImage and Image.gz are both compressed kernel images, just with different target names, depending on the architecture. Since I’m interested in 64-bit Linux, I’ll follow the guide and run the following commands. [3]
KERNEL=kernel8
make -j $(nproc) ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- bcm2711_defconfig
make -j $(nproc) ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- Image dtbs
I get a file named ‘Image’ once Linux has finished building, and running
file on it reassures me that I’ve compiled the correct image.
file result/boot/Image
result/boot/Image: Linux kernel ARM64 boot executable Image, little-endian, 4K pages
I also get a device tree which seems valid
file result/dtbs/bcm2711-rpi-4-b.dtb
result/dtbs/bcm2711-rpi-4-b.dtb: Device Tree Blob version 17, size=56108, boot CPU=0, string block size=4872, DT structure block size=51164
Nice, my image built successfully!
So I now have a Linux image and a device tree. But what about a file system? I need somewhere to store my shell (and the other utilities that I might want).
When you boot Linux, you need to tell it where the file system for the image lives. You can use a file system on a physical device (like a hard drive or SSD), or you can use a RAM-only file system called initramfs. I decided to try the initramfs option. [4]
After reading through the
documentation
on initramfs, it seems like what I need to do is pretty minimal. I need to
create a gzipped cpio archive that will be extracted into a
root file system. That archive needs to contain a script called
/init, which will be what Linux runs after it’s unpacked my
archive. Since the archive will be unpacked into a file system, it needs
to contain all the programs I want access to, such as cp,
ls, and, given that I want a shell, sh. But
there are a lot of tools that come bundled with a standard Linux system.
Am I going to need to build all these from source?
As you may have guessed by my leading question, the answer is no, I don’t need to build these all from source. All I need is to use BusyBox.
To quote the BusyBox website:
BusyBox combines tiny versions of many common UNIX utilities into a single small executable. It provides replacements for most of the utilities you usually find in GNU fileutils, shellutils, etc. The utilities in BusyBox generally have fewer options than their full-featured GNU cousins; however, the options that are included provide the expected functionality and behave very much like their GNU counterparts. BusyBox provides a fairly complete environment for any small or embedded system.
And luckily for me, those “many common” utilities happen to include all the programs I need for my shell to be useful! This means that I don’t need to ship a few hundred different binaries in my initramfs image, I can just ship BusyBox. But how does BusyBox provide all the functionality of the various tools that I want access to?
argv[0]
Normally argv[0] isn’t used for anything; it’s just the name
of your program, after all. And why would you ever care about the name of
the program you’re running?
This is some C code that just prints out argv[0]
#include <stdio.h>
int main(int argc, char** argv)
{
    printf("%s\n", argv[0]);
    return 0;
}
When I run it, I get
~/scratch$ ./a.out
./a.out
which isn’t very interesting. a.out is the name of the
program. What else could it print?
However, an interesting thing happens when you create symlinks to a program.
~/scratch$ tree
.
├── a.out
└── foo -> a.out
Now, if I run foo (which is just a.out), I get this
~/scratch$ ./foo
./foo
Well isn’t that interesting. I only have one binary, but I can change
argv[0] by invoking the same binary through a symlink.
Imagine if I wanted to allow one binary to do multiple things, depending
on which symlink it’s invoked through. I could mimic having multiple
binaries by checking the value of argv[0] and taking the
appropriate action depending on the value.
[5]
if argv[0] == "ls":
    call_ls()
else if argv[0] == "cp":
    call_cp()
// repeat for all other utilities
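The same trick is easy to try without writing any C. What follows is a minimal sketch in shell, not how BusyBox itself is implemented: a single script, which I've called multicall.sh (a made-up name), looks at the name it was invoked as and picks a behaviour. The applet names hello and goodbye are also made up for illustration. One detail the pseudocode glosses over is that a symlink invoked as ./hello reports ./hello rather than hello, so the sketch compares the basename of $0 instead of the raw value.

```shell
#!/bin/sh
# Sketch of argv[0]-style dispatch in shell.
# All file and applet names here are hypothetical.
demo_dir=$(mktemp -d)

# One script plays the role of the multi-call binary.
cat > "$demo_dir/multicall.sh" <<'EOF'
#!/bin/sh
# $0 is the shell's equivalent of argv[0]; strip any leading directory.
case "$(basename "$0")" in
    hello)   echo "hello from the hello applet" ;;
    goodbye) echo "goodbye from the goodbye applet" ;;
    *)       echo "unknown applet: $(basename "$0")" >&2; exit 1 ;;
esac
EOF
chmod +x "$demo_dir/multicall.sh"

# Each "program" is just a symlink back to the one script.
ln -s multicall.sh "$demo_dir/hello"
ln -s multicall.sh "$demo_dir/goodbye"

# Same file on disk, different behaviour depending on the name used.
"$demo_dir/hello"
"$demo_dir/goodbye"
```

Running the script directly as multicall.sh falls through to the error branch, since that name is not one of the applets it knows about.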
This is how BusyBox works; BusyBox has a few hundred programs inside of
it, and you symlink each program name to the BusyBox binary. When you
invoke BusyBox via the relevant symlink, BusyBox checks
argv[0] and calls the appropriate subprogram for you. This
means that I can ship one binary - BusyBox - but have access to all the
tools that BusyBox contains internally. Listing out the programs BusyBox
includes shows almost every program I’ve ever used
Usage: busybox [function [arguments]...]
or: busybox --list[-full]
or: busybox --show SCRIPT
or: busybox --install [-s] [DIR]
or: function [arguments]...
BusyBox is a multi-call binary that combines many common Unix
utilities into a single executable. Most people will create a
link to busybox for each function they wish to use and BusyBox
will act like whatever it was invoked as.
Currently defined functions:
[, [[, acpid, add-shell, addgroup, adduser, adjtimex, arch, arp,
arping, ascii, ash, awk, base32, base64, basename, bc, beep,
blkdiscard, blkid, blockdev, bootchartd, brctl, bunzip2, bzcat, bzip2,
cal, cat, chat, chattr, chgrp, chmod, chown, chpasswd, chpst, chroot,
chrt, chvt, cksum, clear, cmp, comm, conspy, cp, cpio, crc32, crond,
crontab, cryptpw, cttyhack, cut, date, dc, dd, deallocvt, delgroup,
deluser, depmod, devmem, df, dhcprelay, diff, dirname, dmesg, dnsd,
dnsdomainname, dos2unix, dpkg, dpkg-deb, du, dumpkmap, dumpleases,
echo, ed, egrep, eject, env, envdir, envuidgid, ether-wake, expand,
expr, factor, fakeidentd, fallocate, false, fatattr, fbset, fbsplash,
fdflush, fdformat, fdisk, fgconsole, fgrep, find, findfs, flock, fold,
free, freeramdisk, fsck, fsck.minix, fsfreeze, fstrim, fsync, ftpd,
ftpget, ftpput, fuser, getopt, getty, grep, groups, gunzip, gzip, halt,
hd, hdparm, head, hexdump, hexedit, hostid, hostname, httpd, hush,
hwclock, i2cdetect, i2cdump, i2cget, i2cset, i2ctransfer, id, ifconfig,
ifdown, ifenslave, ifplugd, ifup, inetd, init, insmod, install, ionice,
iostat, ip, ipaddr, ipcalc, ipcrm, ipcs, iplink, ipneigh, iproute,
iprule, iptunnel, kbd_mode, kill, killall, killall5, klogd, last, less,
link, linux32, linux64, linuxrc, ln, loadfont, loadkmap, logger, login,
logname, logread, losetup, lpd, lpq, lpr, ls, lsattr, lsmod, lsof,
lspci, lsscsi, lsusb, lzcat, lzma, lzop, makedevs, makemime, man,
md5sum, mdev, mesg, microcom, mim, mkdir, mkdosfs, mke2fs, mkfifo,
mkfs.ext2, mkfs.minix, mkfs.vfat, mknod, mkpasswd, mkswap, mktemp,
modinfo, modprobe, more, mount, mountpoint, mpstat, mt, mv, nameif,
nanddump, nandwrite, nbd-client, nc, netstat, nice, nl, nmeter, nohup,
nologin, nproc, nsenter, nslookup, ntpd, od, openvt, partprobe, passwd,
paste, patch, pgrep, pidof, ping, ping6, pipe_progress, pivot_root,
pkill, pmap, popmaildir, poweroff, powertop, printenv, printf, ps,
pscan, pstree, pwd, pwdx, raidautorun, rdate, rdev, readahead,
readlink, readprofile, realpath, reboot, reformime, remove-shell,
renice, reset, resize, resume, rev, rm, rmdir, rmmod, route, rpm,
rpm2cpio, rtcwake, run-init, run-parts, runlevel, runsv, runsvdir, rx,
script, scriptreplay, sed, seedrng, sendmail, seq, setarch, setconsole,
setfattr, setfont, setkeycodes, setlogcons, setpriv, setserial, setsid,
setuidgid, sh, sha1sum, sha256sum, sha3sum, sha512sum, showkey, shred,
shuf, slattach, sleep, smemcap, softlimit, sort, split, ssl_client,
start-stop-daemon, stat, strings, stty, su, sulogin, sum, sv, svc,
svlogd, svok, swapoff, swapon, switch_root, sync, sysctl, syslogd, tac,
tail, tar, taskset, tc, tcpsvd, tee, telnet, telnetd, test, tftp,
tftpd, time, timeout, top, touch, tr, traceroute, traceroute6, tree,
true, truncate, ts, tsort, tty, ttysize, tunctl, ubiattach, ubidetach,
ubimkvol, ubirename, ubirmvol, ubirsvol, ubiupdatevol, udhcpc, udhcpc6,
udhcpd, udpsvd, uevent, umount, uname, unexpand, uniq, unix2dos,
unlink, unlzma, unshare, unxz, unzip, uptime, users, usleep, uudecode,
uuencode, vconfig, vi, vlock, volname, w, wall, watch, watchdog, wc,
wget, which, who, whoami, whois, xargs, xxd, xz, xzcat, yes, zcat,
zcip
All I need to do to get access to all of these tools is make sure that my
/init script sets up the relevant symlinks before starting
the shell.
I don’t plan on packaging a C standard library in my system, so I need to make sure that BusyBox is compiled statically (otherwise BusyBox will search for a non-existent system-wide C standard library).
Luckily this isn’t that hard. To compile BusyBox I first clone the repo and then run
make defconfig
which produces a .config file. When I then open up this
.config I see this
#
# Build Options
#
# CONFIG_STATIC is not set
Changing this to
CONFIG_STATIC=y
means that my BusyBox image will now build statically.
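Editing .config by hand works fine, but the same change can also be scripted. This is a small sketch that assumes the "# CONFIG_STATIC is not set" line appears exactly as shown above; the helper name enable_static is my own invention and not something BusyBox provides.

```shell
#!/bin/sh
# Sketch: turn "# CONFIG_STATIC is not set" into "CONFIG_STATIC=y".
# enable_static is a hypothetical helper name, not part of BusyBox.
enable_static() {
    config_file=$1
    tmp_file="$config_file.tmp"
    # Write to a temporary file and move it into place. This avoids
    # relying on sed -i, whose syntax differs between GNU and BSD sed.
    sed 's/^# CONFIG_STATIC is not set$/CONFIG_STATIC=y/' "$config_file" > "$tmp_file" &&
        mv "$tmp_file" "$config_file"
    # Return non-zero if the option did not end up enabled.
    grep -q '^CONFIG_STATIC=y$' "$config_file"
}
```

From the BusyBox source tree, after make defconfig, this would be called as enable_static .config.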
I then build BusyBox using this command
make CROSS_COMPILE=aarch64-unknown-linux-gnu- -j $(nproc)
After building, I can verify that it is indeed a static binary.
file result/busybox
result/busybox: ELF 64-bit LSB executable, ARM aarch64, version 1 (GNU/Linux), statically linked, for GNU/Linux 3.10.0, stripped
Now that I’ve gotten BusyBox, I can actually create my initramfs image, which will contain the following:
the /init file (called at startup)
BusyBox
whatever else the /init script needs
Putting all that together results in a file system that looks like this
.
├── bin
│   ├── busybox
│   ├── ln -> busybox
│   ├── ls -> busybox
│   └── sh -> busybox
└── init
With this as the init script
#!/bin/sh
for command in $(busybox --list); do
    if [ ! -e "/bin/$command" ]; then
        ln -s busybox "/bin/$command"
    fi
done
exec /bin/sh
This script creates symlinks from the programs BusyBox packages to the
BusyBox binary itself. It then invokes /bin/sh, which creates
the shell that I’ll interact with.
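For reference, staging that tree only takes a handful of commands. The sketch below is an assumption about the layout rather than a record of the exact steps I ran: stage_rootfs is a hypothetical helper that takes the path to a BusyBox binary, the path to an init script, and the directory to populate.

```shell
#!/bin/sh
# Sketch: lay out the initramfs tree shown above.
# stage_rootfs is a hypothetical helper; its arguments are
#   $1 = BusyBox binary, $2 = init script, $3 = directory to populate.
stage_rootfs() {
    busybox_bin=$1
    init_script=$2
    root_dir=$3

    mkdir -p "$root_dir/bin" || return 1
    cp "$busybox_bin" "$root_dir/bin/busybox" || return 1
    cp "$init_script" "$root_dir/init" || return 1
    # The kernel executes /init directly, so it must be executable.
    chmod +x "$root_dir/bin/busybox" "$root_dir/init" || return 1

    # Pre-create the few links from the tree above. The targets are
    # relative, so they stay valid once this directory becomes /.
    for applet in sh ln ls; do
        ln -sf busybox "$root_dir/bin/$applet" || return 1
    done
}
```

With my paths this would be something like stage_rootfs result/busybox init rootfs, where init is a file holding the script above and rootfs is a name I picked for the staging directory.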
Now that I’ve got the file system set up I need to package it in a way that Linux understands. Luckily the initramfs documentation provides a script for that!
#!/bin/sh
# Copyright 2006 Rob Landley <rob@landley.net> and TimeSys Corporation.
# Licensed under GPL version 2
if [ $# -ne 2 ]
then
  echo "usage: mkinitramfs directory imagename.cpio.gz"
  exit 1
fi
if [ -d "$1" ]
then
  echo "creating $2 from $1"
  (cd "$1"; find . | cpio -o -H newc | gzip) > "$2"
else
  echo "First argument must be a directory"
  exit 1
fi
This script takes two arguments: a directory to convert to an initramfs image, and what you want the resulting compressed cpio file to be called. [6]
Running this script on my file system gives me a compressed cpio archive,
which seems to be valid (I called my initramfs file
init.cpio)
file init.cpio
init.cpio: gzip compressed data, from Unix, original size modulo 2^32 3069952
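file only confirms the gzip wrapper. To see what actually ended up inside, the archive can be decompressed and listed with cpio -t. This is a small sketch that assumes gzip and cpio are installed (the packaging script above already needs both); list_initramfs is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print the file names stored in a gzipped cpio archive.
# list_initramfs is a hypothetical helper name.
list_initramfs() {
    # -i reads an archive from stdin; -t lists names instead of extracting.
    # cpio reports its block count on stderr, which is discarded here.
    gzip -dc "$1" | cpio -it 2>/dev/null
}
```

Calling list_initramfs init.cpio should list init and the entries under bin if the archive was built from the tree above.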
I now have all the pieces I need to actually boot the system. I have a kernel, a device tree, and an initramfs file system. Now it’s time to put it all together.
To test out the image, I need real hardware or an emulator. In this case I’m going to use QEMU, which is an emulator that can run a full kernel on a virtual Raspberry Pi, without needing to set up any real hardware. [7]
I can start the kernel by running
qemu-system-aarch64 \
-nographic \
-machine raspi4b \
-cpu cortex-a72 \
-m 2G -smp 4 \
-kernel result/boot/Image \
-dtb result/dtbs/bcm2711-rpi-4-b.dtb \
--initrd scratch/init.cpio \
-serial null \
-chardev stdio,id=uart1 \
-serial chardev:uart1 \
-monitor none
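The important points here are:
the --kernel flag, which points at the kernel image I built
the --dtb flag, which points at the device tree I built
the --initrd flag, which points at my initramfs archive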
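All three of those flags take file paths, so a quick existence check before launching can catch a typo early. The sketch below is a convenience of my own and not anything QEMU provides; missing_boot_files is a hypothetical helper name, and the paths in the usage comment are the ones from the command above.

```shell
#!/bin/sh
# Sketch: report which of the given boot files are missing.
# missing_boot_files is a hypothetical helper, not part of QEMU.
missing_boot_files() {
    status=0
    for path in "$@"; do
        if [ ! -f "$path" ]; then
            echo "missing: $path" >&2
            status=1
        fi
    done
    # Returns 0 only when every file exists.
    return "$status"
}

# Possible usage, with the paths from the QEMU command above:
#   missing_boot_files result/boot/Image \
#       result/dtbs/bcm2711-rpi-4-b.dtb \
#       scratch/init.cpio || exit 1
```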
And after some waiting I see
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd083]
[ 0.000000] KASLR disabled due to lack of seed
[ 0.000000] Machine model: Raspberry Pi 4 Model B
[ 0.000000] efi: UEFI not found.
[ 0.000000] Reserved memory: created CMA memory pool at 0x000000002c000000, size 64 MiB
[ 0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[ 0.000000] OF: reserved mem: 0x000000002c000000..0x000000002fffffff (65536 KiB) map reusable linux,cma
[ 0.000000] NUMA: No NUMA configuration found
[ 0.000000] NUMA: Faking a node at [mem 0x0000000000000000-0x000000003bffffff]
[ 0.000000] NUMA: NODE_DATA [mem 0x3bdd33c0-0x3bdd5fff]
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000000000000-0x000000003bffffff]
[ 0.000000] DMA32 empty
[ 0.000000] Normal empty
[ 0.000000] Movable zone start for each node
// ignore most of the output
[ 1.410530] of_cfs_init
[ 1.413332] of_cfs_init: OK
[ 1.416512] clk: Disabling unused clocks
[ 1.474364] Freeing unused kernel memory: 4864K
[ 1.477876] Run /init as init process
/bin/sh: can't access tty; job control turned off
~ #
I have a shell!! And I can verify that everything is working with the time-honored tradition of hello world.
~ # echo hello world
hello world
And I’m done! I have a full - although limited - Linux image that boots!
At this point I've achieved what I set out to do. I now have a minimal Linux system that I can use to boot into a shell!
From here there's a bunch of different directions that I could go. I could look at integrating a system-wide C standard library, so I could have dynamically linked executables. I could try to get networking set up, so I can ssh into the system. Or I could investigate other init systems, like systemd.
This won't be the last time that I'm writing about Linux, but for now this is a great place to stop.