bhyvecon Ottawa 2019 - The BSD Hypervisor Conference
Speaker: Kamil Rytarowski (not the NVMM author!)
E-mail: kamil@netbsd.org
Date: May 14th, 2019
Location: Ottawa, Canada
Kamil Rytarowski (born 1987)
Krakow, Poland
NetBSD user since 6.1.
NetBSD Foundation member since 2015.
Work areas: kernel, userland, pkgsrc.
Interest: NetBSD on desktop and in particular NetBSD as a workstation.
Current activity in 3rd party software:
Author: Maxime Villard (maxv @ NetBSD org)
Maxime is an x86 and RISC-V maintainer in NetBSD.
Maxime is the author of notable projects such as Kernel ASan and Kernel ASLR (and many others).
Quick characteristics:
NVMM is divided into two general parts:
I will pick the following definitions:
NVMM does not implement an emulator.
NVMM is suitable for existing emulators such as QEMU, VirtualBox, Android Studio, or xhyve.
This makes NVMM similar to:
/usr/src/sys/dev/nvmm
|-- Makefile
|-- files.nvmm
|-- nvmm.c
|-- nvmm.h
|-- nvmm_internal.h
|-- nvmm_ioctl.h
`-- x86
    |-- Makefile
    |-- nvmm_x86.c
    |-- nvmm_x86.h
    |-- nvmm_x86_svm.c
    |-- nvmm_x86_svmfunc.S
    |-- nvmm_x86_vmx.c
    `-- nvmm_x86_vmxfunc.S

1 directory, 13 files
$ wc -l * x86/*
       1 CVS
      13 Makefile
      14 files.nvmm
    1151 nvmm.c
     129 nvmm.h
     121 nvmm_internal.h
     148 nvmm_ioctl.h
       2 x86
       1 x86/CVS
       7 x86/Makefile
     332 x86/nvmm_x86.c
     269 x86/nvmm_x86.h
    2371 x86/nvmm_x86_svm.c
     200 x86/nvmm_x86_svmfunc.S
    3157 x86/nvmm_x86_vmx.c
     260 x86/nvmm_x86_vmxfunc.S
    8176 total
/usr/src/lib/libnvmm
|-- Makefile
|-- libnvmm.3
|-- libnvmm.c
|-- libnvmm_x86.c
|-- nvmm.h
`-- shlib_version

0 directories, 6 files
$ wc -l *
       1 CVS
      15 Makefile
     635 libnvmm.3
     537 libnvmm.c
    3218 libnvmm_x86.c
     107 nvmm.h
       5 shlib_version
    4518 total
Pros:
Cons:
From an end-user perspective, NVMM is the same in spirit as WHPX, HVF, HAXM, and KVM (we are closer to WHPX than to KVM).
All of these hardware acceleration APIs could be abstracted with a single API (example: https://github.com/StrikerX3/virt86).
In the end we build the same simulators on top of different hypervisors; the basic differences between these solutions are as follows:
#include <nvmm.h>

/* mach is an opaque structure */
struct nvmm_machine mach;

/* create the machine */
nvmm_machine_create(&mach);

/* create VCPU0 */
nvmm_cpuid_t vcpu = 0;
nvmm_vcpu_create(&mach, vcpu);
A virtual machine is associated with its process and is automatically destroyed on either normal or abnormal exit.
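The teardown can also be done explicitly; a minimal sketch, using the nvmm_vcpu_destroy() and nvmm_machine_destroy() counterparts from libnvmm(3):

/* explicit teardown, for emulators that do not want to rely
 * on the automatic destruction at process exit */
nvmm_vcpu_destroy(&mach, vcpu);
nvmm_machine_destroy(&mach);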
struct nvmm_x64_state state;
nvmm_cpuid_t vcpu = 0;

/* get general purpose registers */
nvmm_vcpu_getstate(&mach, vcpu, &state, NVMM_X64_STATE_GPRS);

/* modify the state */
state.gprs[NVMM_X64_GPR_RBX] = 0xabcdef;

/* set general purpose registers */
nvmm_vcpu_setstate(&mach, vcpu, &state, NVMM_X64_STATE_GPRS);
A recent commit explains how we optimize the performance of these calls by reducing the number of syscalls:
https://mail-index.netbsd.org/source-changes/2019/04/28/msg105538.html
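Independently of that commit, an emulator can already save syscalls by batching several state classes into one call: the last argument of nvmm_vcpu_getstate()/nvmm_vcpu_setstate() is a bitmask, so the flags can be OR-ed together. A minimal sketch:

struct nvmm_x64_state state;

/* fetch GPRs, segment registers and control registers
 * in a single syscall by combining the state flags */
nvmm_vcpu_getstate(&mach, vcpu, &state,
    NVMM_X64_STATE_GPRS | NVMM_X64_STATE_SEGS | NVMM_X64_STATE_CRS);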
There are two mappings: the HVA (Host Virtual Address) in the emulator process, and the GPA (Guest Physical Address) seen by the guest.
This allows an emulator to access the guest's memory as a regular buffer.
size_t size = PAGE_SIZE;

/* allocate buffer */
uintptr_t hva = (uintptr_t)mmap(NULL, size,
    PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE, -1, 0);
gpaddr_t gpa = 0x3000;

/* prepare a buffer for mapping into GPA */
nvmm_hva_map(&mach, hva, size);

/* map HVA->GPA */
nvmm_gpa_map(&mach, hva, gpa, size, PROT_READ|PROT_WRITE);
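Once mapped, the guest's memory can be written through the HVA like any ordinary buffer; a minimal sketch (the instruction bytes are just an illustrative example):

/* copy two instruction bytes into guest physical memory
 * at gpa, through the host-side view of the mapping */
uint8_t code[] = { 0xEB, 0xFE };    /* jmp $ (spin in place) */
memcpy((void *)hva, code, sizeof(code));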
struct nvmm_exit exit;
nvmm_cpuid_t vcpu = 0;

while (1) {
    nvmm_vcpu_run(&mach, vcpu, &exit);
    switch (exit.reason) {
    case NVMM_EXIT_NONE:
        break; /* nothing to do */
    case ... /* several other reasons */
    }
}
NVMM_EXIT_NONE can happen for various reasons, such as a signal delivered to the emulator.
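For example, a common pattern is to kick a VCPU out of nvmm_vcpu_run() by signaling its thread; a minimal sketch, assuming the emulator runs each VCPU on its own pthread (the handler and vcpu_thread variable are hypothetical):

/* empty handler: its only purpose is to interrupt the
 * blocking run call, which then returns NVMM_EXIT_NONE */
static void
kick_handler(int sig)
{
}

struct sigaction sa = { .sa_handler = kick_handler };
sigaction(SIGUSR1, &sa, NULL);  /* no SA_RESTART */

/* from another thread: interrupt the VCPU loop */
pthread_kill(vcpu_thread, SIGUSR1);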
Now we are able to write a bare-bones emulator that performs calculations on registers inside a virtual machine.
static void
nvmm_io_callback(struct nvmm_io *io)
{
    /* handle unhandled I/O access, route to emulated devices */
}

static void
nvmm_mem_callback(struct nvmm_mem *mem)
{
    /* handle unhandled memory access, route to emulated devices */
}

static const struct nvmm_callbacks nvmm_callbacks = {
    .io = nvmm_io_callback,
    .mem = nvmm_mem_callback
};

/* Register unhandled operations for I/O and MEM assist */
nvmm_callbacks_register(&nvmm_callbacks);
/* Callbacks are global per application instance */
Update! Callbacks are no longer global per application instance in the most recent API changes.
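A minimal sketch of the newer, per-machine registration, assuming the reworked libnvmm API in NetBSD-current (the names nvmm_machine_configure(), NVMM_MACHINE_CONF_CALLBACKS and struct nvmm_assist_callbacks are taken from that newer API and may differ from the version shown above):

struct nvmm_assist_callbacks cbs = {
    .io = nvmm_io_callback,
    .mem = nvmm_mem_callback
};

/* attach the assist callbacks to this machine only */
nvmm_machine_configure(&mach, NVMM_MACHINE_CONF_CALLBACKS, &cbs);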
struct nvmm_exit exit;
nvmm_cpuid_t vcpu = 0;

while (1) {
    nvmm_vcpu_run(&mach, vcpu, &exit);
    switch (exit.reason) {
    case NVMM_EXIT_NONE:
        break; /* nothing to do */
    case NVMM_EXIT_IO:
        nvmm_assist_io(&mach, vcpu, &exit);
        break;
    case NVMM_EXIT_MEM:
        nvmm_assist_mem(&mach, vcpu, &exit);
        break;
    case ... /* several other reasons */
    }
}
The API is documented in man pages, which include an elaborate code example running a toy kernel in a toy emulator.
FILES
     https://www.netbsd.org/~maxv/nvmm/nvmm-demo.zip
             Functional example (demonstrator). Contains a virtualizer that
             uses the libnvmm API, and a small kernel that exercises this
             virtualizer.
     src/sys/dev/nvmm/
             Source code of the kernel NVMM driver.
     src/lib/libnvmm/
             Source code of the libnvmm library.

-- libnvmm(3)
Idea:
Possible research:
Why xhyve?
The FreeBSD booting code (userboot.so) has to be cross-built from a modified FreeBSD distribution running on the FreeBSD kernel, which is too much of a burden for a realistic solution.
This differs from Linux booting, as Linux uses a well-defined boot protocol. The existing code is a standalone solution and requires neither up-to-date Linux kernel headers, nor definitions of filesystem internals, nor pregenerated files.
There is a similar direct-booting issue in VMD for OpenBSD. If VMD is ever ported to NetBSD, or direct OpenBSD booting is supported in an existing simulator, it will be done with BIOS/EFI booting.
Direct booting of NetBSD is easier to achieve, as we have direct access to up-to-date system headers. On the other hand, booting is usually not a bottleneck, so it is probably better to invest in proper MULTIBOOT/EFI/BIOS support.
The vmm_ops methods map cleanly onto the libnvmm(3) API.
struct vmm_ops vmm_ops_nvmm = {
    vmx_init,
    vmx_cleanup,
    vmx_vm_init,        /* nvmm_machine_create()... */
    vmx_vcpu_init,      /* nvmm_vcpu_create() */
    vmx_run,            /* nvmm_vcpu_run() */
    vmx_vm_cleanup,
    vmx_vcpu_cleanup,
    vmx_getreg,         /* nvmm_vcpu_getstate() */
    vmx_setreg,         /* nvmm_vcpu_setstate() */
    vmx_getdesc,        /* nvmm_vcpu_getstate() */
    vmx_setdesc,        /* nvmm_vcpu_setstate() */
    vmx_getcap,
    vmx_setcap,
    vmx_vlapic_init,
    vmx_vlapic_cleanup,
    vmx_vcpu_interrupt
    /* Planned new methods, for code executed with VM context:
     * vmm_mem_init
     * vmm_mem_alloc: valloc() + nvmm_hva_map() + nvmm_gpa_map()
     * vmm_mem_free : nvmm_gpa_unmap() + nvmm_hva_unmap() */
};
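A minimal sketch of how the planned vmm_mem_alloc could look on top of libnvmm, following the mapping noted in the comment above (the signature is hypothetical, and mmap() stands in here for xhyve's valloc() to get page-aligned memory):

static void *
vmm_mem_alloc(struct nvmm_machine *mach, gpaddr_t gpa, size_t size)
{
    void *buf;

    /* page-aligned host allocation */
    buf = mmap(NULL, size, PROT_READ|PROT_WRITE,
        MAP_ANON|MAP_PRIVATE, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;

    /* register the host range, then map it into the guest */
    nvmm_hva_map(mach, (uintptr_t)buf, size);
    nvmm_gpa_map(mach, (uintptr_t)buf, gpa, size,
        PROT_READ|PROT_WRITE);
    return buf;
}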
Current status:
Issues with xhyve:
More ideas:
More projects for NVMM:
Project's home page:
The NetBSD Foundation blog post: