Increasing coverage of signal semantics in regression tests

Kernel signal code is a complex maze, it's very difficult to introduce non-trivial changes without regressions. Over the past month I worked on covering missing elementary scanarios involving the ptrace(2) API. Part of the new tests were marked as expected to success, however a number of them are expected to fail.

The NetBSD distribution changes

I've also introduced non-ptrace(2) related changes namely from the domain of kernel sanitizers, kernel fixes and corresponding ATF tests. I won't discuss them forther as they were beyond the ptrace(2) scope. These changes were largely stimulated by students preparing for summer work as a part of Google Summer of Code.

The ptrace(2) ATF commits landed into the repository:

Trap signals processing without signal context reset

The current NetBSD kernel appraoch of processing crash signals (SEGV, FPE, BUS, ILL, TRAP) is to reset the context of signals. This behavior was introduced as an intermediate and partially legitimate fix for cases of masking a crash signal that was causing infinite loop in a dying process.

The expected behavior is to never reset signal context of a trap signal (or any other signal) when executed under a debugger. In order to achieve these semantics I've introduced a fix for this for the first time last year, but I had to revert quickly, as it caused side effect breakage, not covered by existing at that time ATF ptrace(2) regression tests. This time I made sure to cover upfront almost all interesting scenarios that are requested to function properly. Surprisingly after grabbing old faulty fix and improving it locally, the current signal maze code caused various side effects in corner cases, such as translating SIGKILL in certain tests to previous trap signal (like SIGSEGV).. In other cases side effect behavior seems to be probably even stranger, as one tests hangs only against a certain type of wait(2)-like function (waitid(2)), and executes without hangs against other wait(2)-like function types.

For the reference such surprises can be achieved with the following patch:

Index: sys/kern/kern_sig.c
===================================================================
RCS file: /cvsroot/src/sys/kern/kern_sig.c,v
retrieving revision 1.350
diff -u -r1.350 kern_sig.c
--- sys/kern/kern_sig.c	29 Nov 2018 10:27:36 -0000	1.350
+++ sys/kern/kern_sig.c	3 Mar 2019 19:26:54 -0000
@@ -911,13 +911,25 @@
 	KASSERT(!cpu_intr_p());
 	mutex_enter(proc_lock);
 	mutex_enter(p->p_lock);
+
+	if (ISSET(p->p_slflag, PSL_TRACED) &&
+	    !(p->p_pptr == p->p_opptr && ISSET(p->p_lflag, PL_PPWAIT))) {
+		p->p_xsig = signo;
+		p->p_sigctx.ps_faked = true; // XXX
+		p->p_sigctx.ps_info._signo = signo;
+		p->p_sigctx.ps_info._code = ksi->ksi_code;
+		sigswitch(0, signo, false);
+		// XXX ktrpoint(KTR_PSIG)
+		mutex_exit(p->p_lock);
+		return;
+	}
+
 	mask = &l->l_sigmask;
 	ps = p->p_sigacts;
 
-	const bool traced = (p->p_slflag & PSL_TRACED) != 0;
 	const bool caught = sigismember(&p->p_sigctx.ps_sigcatch, signo);
 	const bool masked = sigismember(mask, signo);
-	if (!traced && caught && !masked) {
+	if (caught && !masked) {
 		mutex_exit(proc_lock);
 		l->l_ru.ru_nsignals++;
 		kpsendsig(l, ksi, mask);

Such changes need proper investigation and addressing bugs that are now detectable easier with the extended test-suite.

Plan for the next milestone

Keep preparing kernel fixes and after thorough verification applying them to the mainline.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

http://netbsd.org/donations/#how-to-donate