Struggling to fix a bohrbug in X86 Debug Registers

I've spent a month on fixes and debugging issues around the debugging facilities in the kernel.

Distribution cleanup

As planned in the previous month, I've performed cleanup in the distribution:

Improvements in the ATF ptrace(2) tests

I've performed the following operations in the ATF ptrace(2) and related tests, in the following commit order:

Zombie detection race

I've detected that the operation polling for a single process status - using sysctl(3) - can report the same process twice, for the first time as an alive process and for the second time as a zombie. This used to break the ATF ptrace(2) tests where we have a polling function, detecting the state change of a dying process.

I've prepared an applied a fix for this case and the bug detected in ATF ptrace(2) tests is now gone. I've included additional regression tests as noted below, to catch the race in a dedicated t_zombie test-suite.

There is also a controversy here, as it's not specified by POSIX what happens when we are polling an unrelated process, that is not a child of its parent. This operation can fail and we the previous approach we could report the same process twice, while with the newer one reporting twice is much more unlikely on the cost of missing the process at times.

My preferrence is that a process between the state transition of alive->dying->dead and dead->zombie, can disappear for a while. I find this metaphor more natural rather than observing two entities that one is dying and the other is a zombie. There is no perfect solution to this process watch and just following the narrow cases requested with POSIX seems to be the proper solution and it de facto assumes such races.

Bohrbug in X86 Debug Registers

The release engineering machines were observing occasional failures with the X86 Debug Registers in the ptrace(2) ATF test-suite. This was appearing once a while, approximately quarterly, on both amd64 and i386 machines.

I've missed them previously in my original work a year ago, as this happened to be caused by the fact that this bug is not reproducible on newer (quicker? more cores?) Intel CPUs. This failure was reproduced only in software-emulation (slow!) qemu and on Intel Core 2 Duo.

An attempt to execute the same X86 Debug Register test on Intel i7 for 100M times does not make this crash to pop op even a single time.

I've spent the majority of the past month on researching this bug and it happened to be a bohrbug, this means that adding almost any debug code anywhere makes it disappear completely or make reproducible siginificantly less often.

It's not worth the space to describe the process of understanding this bug more closely, step by step - it's worth to mention the current state of the understanding of mine. We can observe a repeated syscall for the _lwp_kill(2) operation (executed with raise(SIGSTOP)), with the same trap frame (or very similar). I still don't how is it possible.

I've finally added support to the NetBSD kernel - in my local copy of the sources - for Bochs-style debug console/protocol (ISA port 0xe9). This allows me to log internal qemu debug and NetBSD kernel messages into a single buffer and output to a single file. I need this property as matching two logging buffers, when the logs of just the the interrupt frames in qemu can quickly grow to hundreds of megabytes. Assuming that I've reproduced the bohrbug after 20k executions of the same test and there is a lot of noise from unrelated kernel/hardware traps, it's a useful property. I'm planning to cleanup this code and submit to the NetBSD sources.

For completeness I had to patch the qemu isa-debugcon (Bochs-style debug console) source code to log into the same buffer as the internal qemu debug messages... sometimes bugs require non-conventional approaches to research them. Ideally I would like to have a rewind/record feature in qemu together with the gdb-server emulation, but we are still not there.

ptrace(2) status and plans

Now there is the X86 Debug Register race. Also, the following tests still marked as expected failures:

This shows that the remaining major problems are in:

Once I will squash the X86 Debug Register race, I plan to work on these bugs in this oder: vfork handling bugs, signals and threads.

I have in mind addition of a lot of a lot of tests verifying correct signal processing in traced programs.

In this iteration of ptrace(2) hardening, I plan to skip Machine-Dependent extensions (like AVXv2 registers) and extending core(5) files with additional features (for example we might want to know exact ABI or FPU hardware layout).

Plan for the next milestone

Keep working on the bug reproducible with X86 Debug Registers and switch to the remaining ptrace(2) failures.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:

http://netbsd.org/donations/#how-to-donate