Tutorial On Rump Kernel Servers and Clients
- Introduction
- Important concepts and a warmup exercise
- Userspace cgd encryption
- Networking
- Emulating makefs
- Master class: NFS server
- Further ideas
Introduction
The rump anykernel architecture makes it possible to run highly componentized kernel code configurations in userspace processes. Coupled with the rump sysproxy facility, it is possible to run loosely distributed client-server "mini-operating systems". Since configuration is minimal and the bootstrap time is measured in milliseconds, these environments are very cheap to set up, use, and tear down on demand.
This document acts as a tutorial on how to configure and use unmodified NetBSD kernel drivers as userspace services with utilities available from the NetBSD base system. As part of this, it presents various use cases. One uses the kernel cryptographic disk driver (cgd) to encrypt a partition. Another demonstrates how to operate an FFS server for editing the contents of a file system even though your user account does not have privileges to use the host's mount() system call. Additionally, using a userspace TCP/IP server with an unmodified web browser is detailed.
The minimum NetBSD source version which supports everything described in this document is -current starting from mid-March 2011 (NetBSD 5.99.48 and later). The tutorial applies to all hardware architectures supported by NetBSD.
Important concepts and a warmup exercise
This section goes over basic concepts that help in understanding how to start and use rump servers and clients.
A rump kernel service location is specified with a URL. Currently, two types of connections are supported: TCP and local domain (i.e. file system) sockets. TCP connections use standard TCP/IP addressing. The URL is of the format tcp://ip.address:port/.
A local domain socket binds to a pathname on the local system. The URL format is unix://path and accepts both relative and absolute paths. Note that absolute paths require three leading slashes, for example unix:///tmp/cgdserv.
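The slash rule is easy to get wrong, so here is a minimal sketch of a helper that builds a local domain socket URL from a path. The function name rump_unix_url is hypothetical and not part of NetBSD; it only prepends the unix:// prefix, which is what yields three leading slashes for an absolute path.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetBSD): build a unix:// service URL.
# The URL is simply "unix://" followed by the path, so an absolute path
# such as /tmp/rumpserver ends up with three leading slashes.
rump_unix_url() {
    printf 'unix://%s\n' "$1"
}

rump_unix_url rumpserver        # relative path: unix://rumpserver
rump_unix_url /tmp/rumpserver   # absolute path: unix:///tmp/rumpserver
```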
Both the client and the server require a service URL to be specified. For the server, the URL designates where the server should listen for incoming connections, and for the client it specifies which server the client should connect to.
Kernel services are provided by rump servers. Generally speaking, any driver-like kernel functionality can be offered by a rump server. Examples include file systems, networking protocols, the audio subsystem and USB hardware device drivers. A rump server is absolutely standalone and running one does not require for example the creation and maintenance of a root file system.
rump_server is a component-oriented rump kernel server. It can use any combination of available NetBSD kernel components in userspace. In its most basic mode rump_server offers only bare-bones functionality such as kernel memory allocation and thread support, which is generally speaking nothing that is useful on its own for applications.
Components are dynamically loaded on the command line using a linker-like syntax. For example, for a server with FFS capability, you need VFS support and the FFS component: rump_server -lrumpvfs -lrumpfs_ffs. A bare-bones rump_server does not have file system support and will not even be able to perform open() (note: networking servers do not require VFS support). The -l option uses the host's dlopen() routine to load and link components dynamically. It is also possible to use the NetBSD kernel loader/linker to load ELF objects by supplying -m instead, but for simplicity this article always uses -l.
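Since every component is named with the same -lrump prefix, the flag list can be generated from short component names. The following is only a sketch; the helper rump_component_flags is hypothetical and not part of the NetBSD base system.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetBSD): turn short component names
# into the -l flags that rump_server expects, e.g. "vfs" -> "-lrumpvfs".
rump_component_flags() {
    flags=""
    for component in "$@"; do
        # Separate flags with a single space, but do not emit a leading one.
        flags="${flags}${flags:+ }-lrump${component}"
    done
    printf '%s\n' "$flags"
}

# Flags for the FFS-capable server described above.
rump_component_flags vfs fs_ffs
```

With such a helper the FFS server from the text could be started as rump_server $(rump_component_flags vfs fs_ffs) followed by the service URL.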
The URL the server listens to is supplied as the last argument on the command line. The URL follows the format described in the previous section.
Other options control things like the number of virtual CPUs configured for the rump server and the maximum amount of host memory the virtual kernel will allocate. They are documented in the manual page of rump_server.
Rump clients are programs which interface with the kernel servers. They can either be used to configure the server or act as consumers of the functionality provided by the server. Configuring the IP address for a TCP/IP server is an example of the former, while web browsing is an example of the latter. Clients can be considered to be the userland of a rump kernel, but unlike in a usermode operating system they are not confined to a specific file system setup, and are simply run from the hosting operating system.
A client determines the server it connects to by examining the URL in the RUMP_SERVER environment variable.
A client runs as a hybrid in both the host kernel and rump kernel.
It uses essential functionality from the rump
kernel, while all non-essential functionality comes from the host
kernel.
The direct use of the
host's resources for non-essential functionality enables very
lightweight services and is what sets rump apart from other forms
of virtualization.
The set of essential functionality depends on the application. For example, for ls fetching a directory listing with getdents() is essential functionality, while allocating the memory into which the directory contents are fetched is non-essential.
The NetBSD base system contains applications which are preconfigured to act as rump clients. This means that just setting RUMP_SERVER will cause these applications to perform their essential functionality on the specified rump kernel server. These applications are distinguished by a "rump." prefix in their command name. At the time of writing, the list of pure rump clients is:
rump.cgdconfig rump.halt rump.modunload rump.raidctl rump.traceroute rump.dd rump.ifconfig rump.netstat rump.route rump.dhcpclient rump.modload rump.ping rump.sockstat rump.envstat rump.modstat rump.powerd rump.sysctl
Additionally, almost any other dynamically linked binary can act as a rump client, but it is up to the user to specify a correct configuration for hijacking the application's essential functionality. Hijacking is demonstrated in later sections of this document.
The current scheme gives all connecting clients root credentials. It is recommended to take precautions which prevent unauthorized access. For a unix domain socket it is enough to prevent access to the socket using file system permissions. For TCP/IP sockets the only available means is to prevent network access to the socket with the use of firewalls. More fine-grained access control based on cryptographic credentials may be implemented at a future date.
Putting everything together, we're ready to start our first rump server. After startup, we examine the autogenerated hostname it was given, and halt the server. We also observe that the socket is removed when the server exits.
golem> rump_server unix://rumpserver
golem> ls -l rumpserver
srwxr-xr-x  1 pooka  users  0 Mar 11 14:49 rumpserver
golem> sysctl kern.hostname
kern.hostname = golem.localhost
golem> export RUMP_SERVER=unix://rumpserver
golem> rump.sysctl kern.hostname
kern.hostname = rump-06341.golem.localhost.rumpdomain
golem> rump.halt
golem> rump.sysctl kern.hostname
rump.sysctl: prog init failed: No such file or directory
golem> ls -l rumpserver
ls: rumpserver: No such file or directory
As an exercise, try the above, but halt the server with -d to produce a core dump. Examine the core with gdb and especially look at the various threads that were running (in gdb: thread apply all bt). Also, try to create another core with kill -ABRT. Notice that you will have a stale socket in the file system when the server is violently killed. You can remove it with rm.
As a final exercise, start the server with -s. This causes the server to not detach from the console. Then kill it either with SIGTERM from another window (the default signal sent by kill) or by pressing Ctrl-C. You will notice that the server reboots itself cleanly in both cases. If it had file systems, those would be unmounted too. These features are useful for quick iteration when debugging and developing kernel code.
In case you want to use a debugger to further examine later cases we go over in this tutorial, it is recommended you install debugging versions of rump components. That can be done simply by going into src/sys/rump and running make DBG=-g cleandir dependall and after that make install as root. You can also install the debugging versions to an alternate directory using make DESTDIR=/my/dir install and run the code with LD_LIBRARY_PATH set to /my/dir. This scheme also allows you to run kernel servers with non-standard code modifications on a non-privileged account.
Userspace cgd encryption
The cryptographic disk driver, cgd, provides an encrypted view of a block device. The implementation is kernel-based. This makes it convenient and efficient to layer the cryptodriver under a file system so that all file system disk access is encrypted. However, using a kernel driver requires that the code is loaded into the kernel and that a user has the appropriate privileges to configure and access the driver.
Occasionally, it is desirable to encrypt a file system image before distribution. Assume you have a USB image, i.e. one that can boot and run directly from USB media. The image can for example be something you created yourself, or even one of the standard USB installation images offered by NetBSD. You also have a directory tree with confidential data you wish to protect with cgd. This example demonstrates how to use a rump cgd server to encrypt your data. This approach, as opposed to using a driver in the host kernel, has the following properties:
- uses out-of-the-box tools on any NetBSD installation
- does not require any special kernel drivers
- does not require superuser access
- is portable to non-NetBSD systems (although this requires some amount of work)
While there are multiple steps with a fair number of details, in case you plan on doing this regularly, it is possible to script them and automate the process. It is recommended that you follow these instructions as non-root to avoid accidentally overwriting a cgd partition on your host due to a mistyped command.
Let's start with the USB disk image you have. It will have a disklabel such as the following:
golem> disklabel usb.img
# usb.img:
type: unknown
disk: USB image
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 1040
total sectors: 1048576
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0
16 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a:    981792        63     4.2BSD   1024  8192     0  # (Cyl.      0*-    974*)
 b:     66721    981855       swap                     # (Cyl.    974*-   1040*)
 c:   1048513        63     unused      0     0        # (Cyl.      0*-   1040*)
 d:   1048576         0     unused      0     0        # (Cyl.      0 -   1040*)
Our goal is to add another partition after the existing ones to contain the cgd-encrypted data. This will require extending the file on which the image resides, and naturally a large enough USB mass storage to fit the new image.
First, we create a file system image using the makefs command:
golem> makefs unencrypted.ffs preciousdir
Calculated size of `unencrypted.ffs': 12812288 bytes, 696 inodes
Extent size set to 8192
unencrypted.ffs: 12.2MB (25024 sectors) block size 8192, fragment size 1024
using 1 cylinder groups of 12.22MB, 1564 blks, 768 inodes.
super-block backups (for fsck -b #) at:
32,
Populating `unencrypted.ffs'
Image `unencrypted.ffs' complete
Then, we figure out the image size in disk sectors:
golem> expr `stat -f %z unencrypted.ffs` / 512
25024
We then edit the existing image label so that there is a spare partition large enough to hold the image. We need to edit "total sectors" and the "c" and "d" partitions. We also need to create the "e" partition, which starts where the old image ended. Make sure you use "unknown" instead of "unused" as the fstype for partition e.
golem> disklabel -re usb.img
# usb.img:
type: unknown
disk: USB image
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 1040
total sectors: 1073600
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0
16 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a:    981792        63     4.2BSD   1024  8192     0  # (Cyl.      0*-    974*)
 b:     66721    981855       swap                     # (Cyl.    974*-   1040*)
 c:   1073537        63     unused      0     0        # (Cyl.      0*-   1065*)
 d:   1073600         0     unused      0     0        # (Cyl.      0 -   1065*)
 e:     25024   1048576    unknown      0     0        # (Cyl.   1040*-   1065*)
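The numbers in the edited label follow directly from the original label and the image size. A minimal sketch of the arithmetic, using the values from this example (the variable names are illustrative only):

```shell
#!/bin/sh
# Values taken from the original disklabel and the makefs output.
old_total=1048576      # "total sectors" before the edit
image_sectors=25024    # size of unencrypted.ffs in 512-byte sectors
c_offset=63            # offset of partition "c"

# The new partition is appended, so the image grows by its size.
new_total=$((old_total + image_sectors))
c_size=$((new_total - c_offset))

echo "total sectors: $new_total"
echo "c: size $c_size offset $c_offset"
echo "d: size $new_total offset 0"
echo "e: size $image_sectors offset $old_total"
```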
Now, it is time to start a rump server for writing the encrypted data to the image. We need to note that a rump kernel has a local file system namespace and therefore cannot in its natural state see files on the host. However, the -d parameter to rump_server can be used to map files from the host into the rump kernel file system namespace. We start the server in the following manner:
golem> export RUMP_SERVER=unix:///tmp/cgdserv
golem> rump_server -lrumpvfs -lrumpkern_crypto -lrumpdev -lrumpdev_disk \
    -lrumpdev_cgd -d key=/dk,hostpath=usb.img,disklabel=e ${RUMP_SERVER}
This maps partition "e" from the disklabel on usb.img to the key /dk inside the rump kernel. In other words, accessing sector 0 from /dk in the rump kernel namespace will access sector 1048576 on usb.img. The image file is also automatically extended so that the size is large enough to contain the entire partition.
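To make the mapping concrete, the following sketch converts the sector numbers of partition "e" into byte offsets within usb.img. It assumes the 512-byte sector size from the disklabel; the variable names are illustrative.

```shell
#!/bin/sh
sector_size=512    # bytes/sector from the disklabel
e_offset=1048576   # offset of partition "e" in sectors
e_size=25024       # size of partition "e" in sectors

# Byte position in usb.img that corresponds to sector 0 of /dk.
start_byte=$((e_offset * sector_size))
# The image must extend at least to the end of partition "e".
end_byte=$(((e_offset + e_size) * sector_size))

echo "sector 0 of /dk starts at byte $start_byte of usb.img"
echo "usb.img must be at least $end_byte bytes long"
```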
Note that everyone who has access to the server socket will have root access to the kernel server, and hence the data you are going to encrypt. In case you are following these instructions on a multiuser server, it is a good idea to make sure the socket is in a directory only you have access to (directory mode 0700).
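One way to arrange this before starting the server is sketched below. It creates a private directory and points RUMP_SERVER at a socket inside it; the directory name pattern is an arbitrary choice for illustration, and the socket name cgdserv is taken from the example above.

```shell
#!/bin/sh
# Create a directory that only the current user can access. mktemp -d
# creates the directory with mode 0700; the chmod makes that explicit.
sockdir=$(mktemp -d "${TMPDIR:-/tmp}/rump.XXXXXX") || exit 1
chmod 700 "$sockdir"

# The directory path is absolute, hence three slashes after "unix:".
RUMP_SERVER="unix://${sockdir}/cgdserv"
export RUMP_SERVER

ls -ld "$sockdir"
echo "$RUMP_SERVER"
```

The server is then started with ${RUMP_SERVER} as its last argument, exactly as before.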
We can now verify that we get a zero-filled partition of the right size:
golem> rump.dd if=/dk bs=64k > emptypart
195+1 records in
195+1 records out
12812288 bytes transferred in 0.733 secs (17479246 bytes/sec)
golem> hexdump -x emptypart
0000000    0000    0000    0000    0000    0000    0000    0000    0000
*
0c38000
In the above example we could pipe rump.dd output directly to hexdump. However, running two separate commands also conveniently demonstrates that we get the right amount of data from /dk.
If we were to dd our unencrypted.ffs to /dk, we would have added a regular unencrypted partition to the image. The next step is to configure a cgd so that we can write encrypted data to the partition. In this example we'll use a password-based key, but you are free to use anything that is supported by cgdconfig.
golem> rump.cgdconfig -g aes-cbc > usb.cgdparams
golem> cat usb.cgdparams
algorithm aes-cbc;
iv-method encblkno1;
keylength 128;
verify_method none;
keygen pkcs5_pbkdf2/sha1 {
iterations 325176;
salt AAAAgGc4DWwqXN4t0eapskSLWTs=;
};
Note that if you have a fast machine and wish to use the resulting encrypted partition on slower machines, it is a good idea to edit "iterations". The value is automatically calibrated by cgdconfig so that encryption key generation takes about one second on the platform the params file is generated with. This can take significantly longer on slower systems. (More information about the iteration count is available here.)
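As an illustration of such an edit, the sketch below writes a copy of a params file with the iteration count divided by a slowdown factor. The helper name scale_iterations and the factor are assumptions made for this example, not part of cgdconfig. The edit has to happen before the cgd is configured and data is written, since the derived key depends on the iteration count, and a lower count also makes password guessing cheaper.

```shell
#!/bin/sh
# Hypothetical helper (not part of cgdconfig): write a copy of a cgd
# params file with the PBKDF2 iteration count divided by a factor.
# usage: scale_iterations <params-file> <slowdown-factor> <output-file>
scale_iterations() {
    # Extract the current count from the "iterations N;" line.
    old=$(awk '$1 == "iterations" { sub(/;$/, "", $2); print $2; exit }' "$1")
    new=$((old / $2))
    # Rewrite only that line; everything else is copied unchanged.
    sed "s/iterations ${old};/iterations ${new};/" "$1" > "$3"
}
```

For a target machine assumed to be about four times slower, this could be run as scale_iterations usb.cgdparams 4 usb.cgdparams.slow, after which the new file is used in place of the original.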
The next step is to configure the cgd device using the params file. Since we are using password-based encryption we will be prompted for a password. Enter any password you want to use to access the data later.
golem> rump.cgdconfig cgd0 /dk usb.cgdparams
/dk's passphrase:
If we repeat the dd test in the encrypted partition we will get a very different result than above. This is expected, since now we have an encrypted view of the zero-filled partition.
golem> rump.dd if=/dev/rcgd0d | hexdump -x | sed 8q
0000000    9937    5f33    25e7    c341    3b67    c411    9d73    645c
0000010    5b7c    23f9    b694    e732    ce0a    08e0    9037    2b2a
*
0000200    0862    ee8c    eafe    b21b    c5a3    4381    cdb5    2033
0000210    5b7c    23f9    b694    e732    ce0a    08e0    9037    2b2a
*
0000400    ef06    099d    328d    a35d    f4ab    aac0    6aba    d673
0000410    5b7c    23f9    b694    e732    ce0a    08e0    9037    2b2a
NOTE: The normal rules for the raw device names apply, and the correct device path is /dev/rcgd0c on non-x86 archs.
To encrypt our image, we simply need to dd it to the cgd partition.
golem> dd if=unencrypted.ffs bs=64k | rump.dd of=/dev/rcgd0d bs=64k
195+1 records in
195+1 records out
12812288 bytes transferred in 0.890 secs (14395829 bytes/sec)
195+1 records in
195+1 records out
12812288 bytes transferred in 0.896 secs (14299428 bytes/sec)
We have now successfully written an encrypted version of the file system to the image file and can proceed to shut down the rump server. This makes sure all rump kernel caches are flushed.
golem> rump.halt
golem> unset RUMP_SERVER
You will need to make sure the cgd params file is available on the platform you intend to use the image on. There are multiple ways to do this. It is safe even to offer the params file for download with the image — just make sure the password is not available for download. Notably, though, you will be telling everyone how the image was encrypted and therefore lose the benefit of two-factor authentication.
In this example we use fs-utils (the latest version is available from othersrc) to copy the file to the unencrypted "a" partition. Like other utilities in this tutorial, fs-utils works purely in userspace and does not require special privileges or kernel support.
golem> fsu_put usb.img%DISKLABEL:a% usb.cgdparams root/
golem> fsu_ls usb.img%DISKLABEL:a% -l root/usb.cgdparams
-rw-r--r--  1 pooka  users  175 Feb  9 17:50 root/usb.cgdparams
golem> fsu_chown usb.img%DISKLABEL:a% 0:0 root/usb.cgdparams
golem> fsu_ls usb.img%DISKLABEL:a% -l root/usb.cgdparams
-rw-r--r--  1 root  wheel  175 Feb  9 17:50 root/usb.cgdparams
Alternatively, we could use the method described later in this document which works purely with base system utilities.
We are ready to copy the image to a USB stick. This step should be executed with appropriate privileges for raw writes to USB media. If USB access is not possible on the same machine, the image may be copied over network to a suitable machine.
golem# dd if=usb.img of=/dev/rsd0d bs=64k
8387+1 records in
8387+1 records out
549683200 bytes transferred in 122.461 secs (4488638 bytes/sec)
Finally, we can boot the target machine from the USB stick, configure the encrypted partition, mount the file system, and access the data. Note that to perform these operations we need root privileges on the target machine, since we are using the in-kernel drivers.
demogorgon# cgdconfig cgd0 /dev/sd0e /root/usb.cgdparams
/dev/sd0e's passphrase:
demogorgon# disklabel cgd0
# /dev/rcgd0d:
type: cgd
disk: cgd
label: fictitious
flags:
bytes/sector: 512
sectors/track: 2048
tracks/cylinder: 1
sectors/cylinder: 2048
cylinders: 12
total sectors: 25024
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0
4 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a:     25024         0     4.2BSD      0     0     0  # (Cyl.      0 -     12*)
 d:     25024         0     unused      0     0        # (Cyl.      0 -     12*)
disklabel: boot block size 0
disklabel: super block size 0
demogorgon# mount /dev/cgd0a /mnt
demogorgon#
On a real hardware platform the result looks like this. Happy cgd'ing!
Networking
This section explains how to run any dynamically linked networking program against a rump TCP/IP stack without requiring any modifications to the application, including no recompilation. The application we use in this example is the Firefox browser. It is an interesting application for multiple reasons. Segregating the web browser to its own TCP/IP stack is an easy way to increase monitoring and control over what kind of connections the web browser makes. It is also an easy way to get some increased privacy protection (assuming the additional TCP/IP stack can have its own external IP). Finally, a web browser is largely "connectionless", meaning that once a page has been loaded a TCP/IP connection can be discarded. We use this property to demonstrate killing and restarting the TCP/IP stack from under the application.
A rump server with TCP/IP capability is required. If the plan is to access the internet, the virt interface must be present in the rump kernel and the host kernel must have support for tap and bridge. You also must have the appropriate privileges for configuring the setup — while rump kernels do not themselves require privileges, they cannot magically access host resources without the appropriate privileges. If you do not want to access the internet, using the shmif interface is enough and no privileges are required. However, for purposes of this tutorial we will assume you want to access the internet.
Finally, if there is a desire to configure the rump TCP/IP stack with DHCP, the rump kernel must support bpf. Since bpf is accessed via a file system device node, vfs support is required in this case (without bpf there is no need for file system support). Putting everything together, the rump kernel command line looks like this:
rump_server -lrumpnet -lrumpnet_net -lrumpnet_netinet  # TCP/IP networking
            -lrumpvfs -lrumpdev -lrumpdev_bpf           # bpf support
            -lrumpnet_virtif                            # virt(4)
So, to start the TCP/IP server execute the following. Make sure RUMP_SERVER stays set in the shell you want to use to access the rump kernel.
golem> export RUMP_SERVER=unix:///tmp/netsrv
golem> rump_server -lrumpnet -lrumpnet_net -lrumpnet_netinet -lrumpvfs -lrumpdev -lrumpdev_bpf -lrumpnet_virtif ${RUMP_SERVER}
The TCP/IP server is now running and waiting for clients at RUMP_SERVER. For applications to be able to use it, we must do what we do to a regular host kernel TCP/IP stack: configure it. This is discussed in the next section.
A kernel mode TCP/IP stack typically has access to networking hardware for sending and receiving packets, so first we must make sure the rump TCP/IP server has the same capability. The canonical way is to use bridging and we will present that here. An alternative is to use the host kernel to route the packets, but that is left as an exercise to the reader. In both cases, the rump kernel sends and receives external packets via a /dev/tap<n> device node. The rump kernel must have read-write access to this device node. The details are up to you, but the recommended way is to use appropriate group privileges.
To create a tap interface and attach it via a bridge to a host Ethernet interface we execute the following commands. You can attach as many tap interfaces to a single bridge as you like. For example, if you run multiple rump kernels on the same machine, adding all the respective tap interfaces to the same bridge will allow the different kernels to see each other's Ethernet traffic.
Note that the actual interface names will vary depending on your system and which tap interfaces are already in use.
golem# ifconfig tap0 create
golem# ifconfig tap0 up
golem# ifconfig tap0
tap0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
        address: f2:0b:a4:f1:da:00
        media: Ethernet autoselect
golem# ifconfig bridge0 create
golem# brconfig bridge0 add tap0 add re0
golem# brconfig bridge0 up
golem# brconfig bridge0
bridge0: flags=41<UP,RUNNING>
        Configuration:
                priority 32768 hellotime 2 fwddelay 15 maxage 20
                ipfilter disabled flags 0x0
        Interfaces:
                re0 flags=3<LEARNING,DISCOVER>
                        port 2 priority 128
                tap0 flags=3<LEARNING,DISCOVER>
                        port 4 priority 128
        Address cache (max cache: 100, timeout: 1200):
                b2:0a:53:0b:0e:00 tap0 525 flags=0<>
                go:le:ms:re:0m:ac re0 341 flags=0<>
That takes care of support on the host side. The next task is to create an interface within the rump kernel which uses the tap interface we just created. In case you are not using tap0, you need to know that virt<n> always corresponds to the host's tap<n>.
golem> rump.ifconfig virt0 create
golem> rump.ifconfig virt0
virt0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
        address: b2:0a:bb:0b:0e:00
In case you do not have permission to open the corresponding tap device on the host, or the host's tap interface has not been created, you will get an error from ifconfig when trying to create the virt interface.
Ok, so the rump kernel interface exists. The final step is to configure an address and routing. In case there is DHCP support on the network you bridged the rump kernel to, you can simply run rump.dhcpclient:
golem> rump.dhcpclient virt0
virt0: adding IP address 192.168.2.125/24
virt0: adding route to 192.168.2.0/24
virt0: adding default route via 192.168.2.1
lease time: 172800 seconds (2.00 days)
If there is no DHCP service available, you can do the same manually with the same result.
golem> rump.ifconfig virt0 inet 192.168.2.125 netmask 0xffffff00
golem> rump.route add default 192.168.2.1
add net default: gateway 192.168.2.1
You should now have network access via the rump kernel. You can verify this with a simple ping.
golem> rump.ping www.NetBSD.org
PING www.NetBSD.org (204.152.190.12): 56 data bytes
64 bytes from 204.152.190.12: icmp_seq=0 ttl=250 time=169.102 ms
64 bytes from 204.152.190.12: icmp_seq=1 ttl=250 time=169.279 ms
64 bytes from 204.152.190.12: icmp_seq=2 ttl=250 time=169.633 ms
^C
----www.NetBSD.org PING Statistics----
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 169.102/169.338/169.633/0.270 ms
In case everything is working fine, you will see the same latency as with the host networking stack.
golem> ping www.NetBSD.org
PING www.NetBSD.org (204.152.190.12): 56 data bytes
64 bytes from 204.152.190.12: icmp_seq=0 ttl=250 time=169.134 ms
64 bytes from 204.152.190.12: icmp_seq=1 ttl=250 time=169.281 ms
64 bytes from 204.152.190.12: icmp_seq=2 ttl=250 time=169.497 ms
^C
----www.NetBSD.org PING Statistics----
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 169.134/169.304/169.497/0.183 ms
We are now able to run arbitrary unmodified applications using the TCP/IP stack provided by the rump kernel. We just have to set LD_PRELOAD to instruct the dynamic linker to load the rump hijacking library. Also, now is a good time to make sure RUMP_SERVER is still set and points to the right place.
golem> export LD_PRELOAD=/usr/lib/librumphijack.so
Congratulations, that's it. Any application you run from the shell in which you set the variables will use the rump TCP/IP stack. If you wish to use another rump TCP/IP server (which has networking configured), simply adjust RUMP_SERVER. Using this method you can for example segregate some "evil" applications to their own networking stack.
Since the TCP/IP stack is running in a separate process from the client, it is possible to kill and restart the TCP/IP stack from under the application without having to restart the application. Potential applications for this are taking features available in later releases into use or fixing a security vulnerability. Even though NetBSD kernel code barely ever crashes, it does happen, and this will also protect against that.
Since networking stack code does not contain any checkpointing support, killing the hosting process will cause all kernel state to go missing and for example previously used sockets will not be available after restart. Even if checkpointing were added for things like file descriptors, generally speaking checkpointing a TCP connection is not possible. The reaction to this unexpected loss of state largely depends on the application. For example, ssh will not handle this well, but Firefox will generally speaking recover without adverse effects.
Before starting the hijacked application you should instruct the rump client library to retry connecting to the server in case the connection is lost. This is done by setting RUMPHIJACK_RETRYCONNECT to a value documented on the manual page.
golem> export RUMPHIJACK_RETRYCONNECT=inftime
golem> firefox
Now we can use Firefox just like we would with the host kernel networking stack. When we want to restart the TCP/IP stack, we can use any method we'd like for killing the TCP/IP server, even kill -9 or just having it panic. The client will detect the severed connection and print out the following diagnostic warnings.
rump_sp: connection to kernel lost, trying to reconnect ...
rump_sp: still trying to reconnect ...
rump_sp: still trying to reconnect ...
Once the server has been restarted, the following message will be printed. If the server downtime was long, the client can take up to 10 seconds to retry, so do not be surprised if you do not see it immediately.
rump_sp: reconnected!
Note that this message only signals that the client has a connection to the server. In case the server has not been configured yet to have an IP address and a gateway, the application will not be able to function regularly. However, when that step is complete, normal service can resume.
Any pages that were loading when the TCP/IP server went down will not finish loading. However, this can be "fixed" simply by reloading the pages.
Emulating makefs
The makefs command takes a directory tree and creates a file system image out of it. This groundbreaking utility was developed back when crossbuild capability was added to the NetBSD source tree. Since makefs constructs the file system purely in userspace, it does not depend on the buildhost kernel to have file system support or the build process to have privileges to mount a file system. However, its implementation requires many one-way modifications to the kernel file system driver. Since performing these modifications is complicated, out of the NetBSD kernel file systems with r/w support makefs supports only FFS.
This part of the tutorial will show how to accomplish the same with out-of-the-box binaries. It applies to any r/w kernel file system for which NetBSD ships a newfs utility capable of creating image files. We learn how to mount a file system within the hijacked rump kernel namespace and how to use pax to copy files to the file system image.
First, we need a suitable victim directory tree we want to create an image out of. We will again use the nethack source tree as an example. We need to find out how much space the directory tree will require.
golem> du -sh nethack-3.4.3/
12M     nethack-3.4.3/
Next, we need to create an empty file system. We use the standard newfs tool for this (command name will vary depending on target file system type). Since the file system must also accommodate metadata such as inodes and directory entries, we will create a slightly larger file system than what was indicated by du and reserve roughly 10% more disk space. There are ways to increase the accuracy of this calculation, but they are beyond the scope of this document.
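As a rough sketch of that estimate, the following adds 10% to the size reported by du and rounds up to a whole megabyte. The variable names are illustrative, and the margin is only the rule of thumb from the text, not a guarantee that the tree will fit.

```shell
#!/bin/sh
# Size reported by du for the directory tree, in megabytes.
tree_mb=12

# Add 10% for metadata and round up to the next whole megabyte.
image_mb=$(((tree_mb * 110 + 99) / 100))

# Print the newfs command that would create an image of that size.
echo "newfs -F -s ${image_mb}M nethack.img"
```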
golem> newfs -F -s 14M nethack.img
nethack.img: 14.0MB (28672 sectors) block size 4096, fragment size 512
using 4 cylinder groups of 3.50MB, 896 blks, 1696 inodes.
super-block backups (for fsck_ffs -b #) at:
32, 7200, 14368, 21536,
Now, we need to start a rump server capable of mounting this particular file system type. As in the cgd example, we map the host image as /dk in the rump kernel namespace.
golem> rump_server -lrumpvfs -lrumpfs_ffs -d key=/dk,hostpath=nethack.img,size=host unix:///tmp/ffs_server
Next, we need to configure our shell for rump syscall hijacking. This is done by pointing the LD_PRELOAD environment variable to the hijack library. Every command executed with the variable set will attempt to contact the rump server and will fail if the server cannot be contacted. This is demonstrated below by first omitting RUMP_SERVER and attempting to run a command. Shell builtins such as export and unset can still be run, since they do not start a new process.
golem> export LD_PRELOAD=/usr/lib/librumphijack.so
golem> lua -e 'print("Hello, rump!")'
lua: rumpclient init: No such file or directory
golem> export RUMP_SERVER=unix:///tmp/ffs_server
golem> lua -e 'print("Hello, rump!")'
Hello, rump!
Now, we can access the rump kernel file system namespace using the special path prefix /rump.
golem> ls -l /rump
total 1
drwxr-xr-x  2 root  wheel  512 Mar 12 13:31 dev
By default, a rump root file system includes only some autogenerated device nodes based on which components are loaded. As an experiment, you can try the above also against a server which does not support VFS.
We then proceed to create a mountpoint and mount the file system.
Note that we start a new shell here: the shell in which we set
LD_PRELOAD was itself not started with the variable set, so that
process does not have hijacking configured and cannot cd into /rump.
There is no reason we could not perform everything without changing
the current working directory, but doing so often means less typing.
golem> $SHELL
golem> cd /rump
golem> mkdir mnt
golem> df -i mnt
Filesystem  1K-blocks   Used  Avail %Cap  iUsed iAvail %iCap Mounted on
rumpfs              1      1      0 100%      0      0    0% /
golem> mount_ffs /dk /rump/mnt
mount_ffs: Warning: realpath /dk: No such file or directory
golem> df -i mnt
Filesystem  1K-blocks   Used  Avail %Cap  iUsed iAvail %iCap Mounted on
/dk             13423      0  12752   0%      1   6781    0% /mnt
Note that the realpath warning from mount_ffs is only a warning and
can be ignored. The userland utility tries to resolve the source
device /dk and fails, since the device exists only inside the rump
kernel. Note also that you need to supply the full path for the
mountpoint, i.e. /rump/mnt instead of mnt. Otherwise the userland
mount utility may adjust it incorrectly.
If you run the mount command you will note that the mounted file
system is not present. This is expected, since the file system has
been mounted within the rump kernel and not the host kernel, and
therefore the host kernel does not know anything about it. The list
of mounted file systems is fetched with the getvfsstat() system call.
Since the system call does not take any pathname, the hijacking
library cannot automatically determine if the user wanted the
mountpoints from the host kernel or the rump kernel. However, it
is possible for the user to configure the behaviour by setting the
RUMPHIJACK environment variable to contain the string vfs=getvfsstat.
golem> env RUMPHIJACK=vfs=getvfsstat mount
rumpfs on / type rumpfs (local)
/dk on /mnt type ffs (local)
Other ways of configuring the behaviour of system call hijacking
are described on the rumphijack(3) manual page.
Note that setting the variable will override the default behaviour,
including the ability to access /rump. You can restore this by
setting the variable to vfs=getvfsstat,path=/rump.
As with LD_PRELOAD, setting the variable will affect only processes
you run after setting it, and the behaviour of the shell it was set
in will remain unchanged.
Now we can copy the files over. Due to how pax works, we first change
our working directory to avoid encoding the full source path in the
destination. The alternative is to use the -s option, but I find that
changing the directory is often simpler.
golem> cd ~/srcdir
golem> pax -rw nethack-3.4.3 /rump/mnt/
golem> df -i /rump/mnt/
Filesystem  1K-blocks   Used  Avail %Cap  iUsed iAvail %iCap Mounted on
/dk             13423  11962    790  93%    695   6087   10% /mnt
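For comparison, we present the same operation using cp. The
destination mnt/ is given as a relative path here, so the command is
run with /rump as the current working directory.
Obviously, only one of pax or cp is necessary and you can use
whichever you find more convenient.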
golem> cp -Rp ~/srcdir/nethack-3.4.3 mnt/
golem> df -i /rump/mnt/
Filesystem  1K-blocks   Used  Avail %Cap  iUsed iAvail %iCap Mounted on
/dk             13423  11962    790  93%    695   6087   10% /mnt
Then, the only thing left is to unmount the file system to make sure that we have a clean file system image.
golem> umount -R /rump/mnt
golem> df -i /rump/mnt
Filesystem  1K-blocks   Used  Avail %Cap  iUsed iAvail %iCap Mounted on
rumpfs              1      1      0 100%      0      0    0% /
It is necessary to give the -R option to umount, or it will attempt
to adjust the path by itself. This will usually result in the wrong
path and the unmount operation failing.
It is possible to set RUMPHIJACK in a way which does not require
using -R, but that is left as an exercise for the reader.
We do not need to remove the mountpoint since the rump root file system is an in-memory file system and will be removed automatically when we halt the server.
Congratulations, you now have a clean file system image containing the desired files.
Master class: NFS server
This section presents scripts which start a rump kernel capable of serving NFS file systems and shows how to mount the service using a client connected to another kernel server. At this stage all the relevant pointers to manual pages have been given, so the scripts are merely presented instead of being explained thoroughly.
#!/bin/sh
#
# This script starts a rump kernel with NFS serving capability,
# configures a network interface and starts hijacked binaries
# which are necessary to serve NFS (rpcbind, mountd, nfsd).
#

# directory used for all temporary stuff
NFSX=/tmp/nfsx

# no need to edit below this line
haltserv()
{
        RUMP_SERVER=unix://${NFSX}/nfsserv rump.halt 2> /dev/null
        RUMP_SERVER=unix://${NFSX}/nfscli rump.halt 2> /dev/null
}

die()
{
        haltserv
        echo $*
        exit 1
}

# start from a fresh table
haltserv
rm -rf ${NFSX}
mkdir ${NFSX} || die cannot mkdir ${NFSX}

# create ffs file system we'll be exporting
newfs -F -s 10000 ${NFSX}/ffs.img > /dev/null || die could not create ffs

# start nfs kernel server. this is a mouthful
export RUMP_SERVER=unix://${NFSX}/nfsserv
rump_server -lrumpvfs -lrumpdev -lrumpnet \
    -lrumpnet_net -lrumpnet_netinet -lrumpnet_local -lrumpnet_shmif \
    -lrumpdev_disk -lrumpfs_ffs -lrumpfs_nfs -lrumpfs_nfsserver \
    -d key=/dk,hostpath=${NFSX}/ffs.img,size=host ${RUMP_SERVER}
[ $? -eq 0 ] || die rump server startup failed

# configure server networking
rump.ifconfig shmif0 create
rump.ifconfig shmif0 linkstr ${NFSX}/shmbus
rump.ifconfig shmif0 inet 10.1.1.1

# especially rpcbind has a nasty habit of looping
export RUMPHIJACK_RETRYCONNECT=die

export LD_PRELOAD=/usr/lib/librumphijack.so

# "mtree"
mkdir -p /rump/var/run
mkdir -p /rump/var/db
touch /rump/var/db/mountdtab
mkdir /rump/etc
mkdir /rump/export

# create /etc/exports
echo '/export -noresvport -noresvmnt -maproot=0:0 10.1.1.100' | \
    dd of=/rump/etc/exports 2> /dev/null

# mount our file system
mount_ffs /dk /rump/export 2> /dev/null || die mount failed
touch /rump/export/its_alive

# start rpcbind. we want /var/run/rpcbind.sock
RUMPHIJACK='blanket=/var/run,socket=all' rpcbind || die rpcbind start

# ok, then we want mountd in the similar fashion
RUMPHIJACK='blanket=/var/run:/var/db:/export,socket=all,path=/rump,vfs=all' \
    mountd /rump/etc/exports || die mountd start

# finally, it's time for the infamous nfsd to hit the stage
RUMPHIJACK='blanket=/var/run,socket=all,vfs=all' nfsd -tu
#!/bin/sh
#
# This script starts a rump kernel which contains the drivers necessary
# to mount an NFS export. It then proceeds to mount and provides
# a directory listing of the mountpoint.
#

NFSX=/tmp/nfsx
export RUMP_SERVER=unix://${NFSX}/nfscli

rump.halt 2> /dev/null
rump_server -lrumpvfs -lrumpnet -lrumpnet_net -lrumpnet_netinet \
    -lrumpnet_shmif -lrumpfs_nfs ${RUMP_SERVER}

rump.ifconfig shmif0 create
rump.ifconfig shmif0 linkstr ${NFSX}/shmbus
rump.ifconfig shmif0 inet 10.1.1.100

export LD_PRELOAD=/usr/lib/librumphijack.so

mkdir /rump/mnt
mount_nfs 10.1.1.1:/export /rump/mnt

echo export RUMP_SERVER=unix://${NFSX}/nfscli
echo export LD_PRELOAD=/usr/lib/librumphijack.so
To use the NFS server, just run both scripts. The client script prints the configuration it used as export commands, so you can eval the script's output in a Bourne-type shell to get the correct configuration.
golem> sh rumpnfsd.sh
golem> eval `sh rumpnfsclient.sh`
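The reason for eval is that a script runs as a child process and
cannot change the environment of the shell that started it. Printing
export statements and evaluating them in the caller works around this.
A minimal sketch of the pattern follows; print_client_env is a
hypothetical stand-in for the last two lines of the client script.

```sh
#!/bin/sh
# Hypothetical stand-in for the tail of rumpnfsclient.sh: print the
# settings as shell commands instead of applying them.
print_client_env() {
    echo "export RUMP_SERVER=unix://$1/nfscli"
    echo "export LD_PRELOAD=/usr/lib/librumphijack.so"
}

# Evaluate the printed commands. A subshell is used here so that this
# demonstration does not alter the shell it is run from.
(
    eval "$(print_client_env /tmp/nfsx)"
    echo "$RUMP_SERVER"    # prints unix:///tmp/nfsx/nfscli
)
```

With this pattern, anything else the script writes to standard output
would be executed by the caller as well, so only the export lines
should be printed there.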
That's it. You can start a shell and access the NFS client as normal.
golem> df /rump/mnt
Filesystem        1K-blocks   Used  Avail %Cap Mounted on
10.1.1.1:/export       4631      0   4399   0% /mnt
golem> sh
golem> cd /rump
golem> jot 100000 > mnt/numbers
golem> df mnt
Filesystem        1K-blocks   Used  Avail %Cap Mounted on
10.1.1.1:/export       4631    580   3819  13% /mnt
When you're done, stop the servers in the normal fashion. You may
also want to remove the /tmp/nfsx temporary directory.
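As a sketch of that cleanup, assuming the socket names and the
temporary directory used by the two scripts above (nfs_teardown is a
hypothetical helper name, not an existing utility):

```sh
#!/bin/sh
# Hypothetical cleanup helper: halt both rump servers started by the
# scripts above and remove the temporary directory.
# usage: nfs_teardown [directory]   (defaults to /tmp/nfsx)
nfs_teardown() {
    nfsx=${1:-/tmp/nfsx}

    # Stop hijacking in this shell first; otherwise rm would try to
    # contact a server that has just been halted.
    unset LD_PRELOAD

    for name in nfsserv nfscli; do
        RUMP_SERVER="unix://${nfsx}/${name}" rump.halt 2> /dev/null || true
    done

    rm -rf "${nfsx}"
}

# Example:
#   nfs_teardown /tmp/nfsx
```

Errors from rump.halt are ignored on purpose, in the same way as in
the haltserv function of the server script, so the helper can be run
even if one of the servers is already gone.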
Further ideas
Kernel code development and debugging was a huge personal motivation for working on this, and is a truly excellent use case especially if you want to safely and easily learn about how various parts of the kernel work.
There are also more user-oriented applications. For example, you can construct servers which run hardware drivers from some later release of NetBSD than what is running on your host. You can also distribute these devices as services on the network.
On a multiuser machine where you do not have control over how your data is backed up you can use a cgd server to provide a file system with better confidentiality guarantees than your regular home directory. You can easily configure your applications to communicate directly with the cryptographic server, and confidential data will never hit the disk unencrypted. This, of course, does not protect against all threat models on a multiuser system, but is a simple way of protecting yourself against one of them.
Furthermore, you have more fine-grained control over privileges. For example, opening a raw socket requires root privileges. This is still true for a rump server, but the difference is that it requires root privileges in the rump kernel, not the host kernel. Now, if the rump server runs with normal user privileges (as is recommended), root privileges inside the rump kernel cannot be used to gain full control of the hosting OS.
In the end, this document only scratched the surface of what is possible by running kernel code as services in userspace.