2016-11-01

Today was the first day of the first month of the project funded by The
NetBSD Foundation. The project has been announced on the TNF blog; the
announcement describes the steps towards getting functional LLDB, Swift (lang)
and VirtualBox on a NetBSD host.

Reference: http://blog.netbsd.org/tnf/entry/funded_contract_2016_2017

The first task is to get LLDB functional on the i386+amd64 NetBSD host. This is
a rather heavy task, so I decomposed it into four milestones:

   1.  Add missing interfaces in ptrace(2), mostly sync it with
       the FreeBSD capabilities, add ATF tests, add documentation.

   2.  Develop process plugin in LLDB based on the FreeBSD code in
       LLDB and make it functional (start passing at least some tests).

   3.  Revamp the process plugin in LLDB for new remote debugging
       capabilities (in order to get it accepted and merged upstream),
       pass more tests.

   4.  LLDB x86 support, pass more of the standard LLDB tests
       and import LLDB to the NetBSD base. Add some ATF LLDB basic
       functionality tests to the tree. The original tests are
       unreliable and generate false positives.

The planned ptrace(2) additions are:

 - PT_LWPINFO - extend struct ptrace_lwpinfo with additional fields used
   in LLDB in the current FreeBSD process code (pl_flags, pl_child_pid,
   pl_siginfo),

 - PT_GETNUMLWPS - number of kernel threads associated with the traced
   process,

 - PT_GETLWPLIST - get the current thread list,

 - PT_SUSPEND - suspend the specified thread,

 - PT_RESUME - resume the specified thread.

Reference: http://mail-index.netbsd.org/tech-kern/2016/10/24/msg021198.html

I got to the point of discovering that there are undocumented ptrace(2) calls:
 - PT_SET_EVENT_MASK,
 - PT_GET_EVENT_MASK,
 - PT_GET_PROCESS_STATE.

They were added by Christos Zoulas in 2011 in order to save forker/forkee pid.

Reference: http://mail-index.netbsd.org/source-changes/2011/09/02/msg026817.html

These interfaces are apparently compatible with OpenBSD, so I imported the
needed documentation from OpenBSD, giving credit to the source, and made my
first commit with the "Sponsored by <The NetBSD Foundation>" attribution.

Reference: http://mail-index.netbsd.org/source-changes/2016/11/01/msg078771.html

Adding interfaces that return the kernel threads associated with the traced
process seems the easiest task for now. I noted that OpenBSD takes an
alternative approach with PT_GET_THREAD_FIRST and PT_GET_THREAD_NEXT, which
is functionally equivalent in terms of the results received.

Their descriptions in the respective documentation are as follows:

FreeBSD:

PT_GETNUMLWPS
        This request returns the number of kernel threads
        associated with the traced process.
PT_GETLWPLIST
        This request can be used to get the current thread list.  A
        pointer to an array of type lwpid_t should be passed in
        addr, with the array size specified by data.  The return
        value from ptrace() is the count of array entries filled
        in.

OpenBSD:

PT_GET_THREAD_FIRST
       This request reads the thread ID of the traced process' first
       thread into the "struct ptrace_thread_state" pointed to by addr.
       The data argument should be set to
       sizeof(struct ptrace_thread_state).
PT_GET_THREAD_NEXT
       This request is just like PT_GET_THREAD_FIRST, except it returns
       the thread ID of the subsequent thread. The "struct ptrace_thread_state"
       pointed to by addr must be initialized by a previous PT_GET_THREAD_FIRST
       or PT_GET_THREAD_NEXT request.

These requests are used in the gdb(1) sources, in the appropriate ?bsd-nat.c
file.

http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/src/external/gpl3/gdb/dist/gdb/fbsd-nat.c?rev=1.1.1.6&content-type=text/plain
http://cvsweb.netbsd.org/bsdweb.cgi/~checkout~/src/external/gpl3/gdb/dist/gdb/obsd-nat.c?rev=1.1.1.2&content-type=text/plain

I sent a mail to tech-userlevel AT NetBSD.org and moved on to investigating the
PT_SUSPEND and PT_RESUME implementation in FreeBSD. It will be the next step
after implementing retrieval of the kernel threads, as these requests are
dedicated to suspending and resuming those LWPs.

As of this evening, the mail has been caught by the spam filters and is still
waiting for the moderator's approval. I got feedback off-list from a developer
that the FreeBSD implementation is saner and less racy. Unless something
changes by tomorrow morning, I will start implementing it. So far I have a
local diff with new ptrace(2) manual entries for these requests.

My helper tools are GitHub repos:
NetBSD: https://github.com/jsonn/src/
FreeBSD: https://github.com/freebsd/freebsd
OpenBSD: https://github.com/openbsd/src

I use GitHub for quick searching of keywords/symbols in the source code.

As an alternative, there is OpenGrok for BSDs under:
http://bxr.su

A fantastic reference for grepping open-source resources is hosted by Debian
people: https://codesearch.debian.net/

For reference, so far the PT_* keywords turn up mostly in the GNU and LLVM
toolchains, plus a few wrappers over the native ptrace(2) interfaces that
recognize the BSDs. Apart from the mentioned toolchains, nothing popular makes
use of them.

2016-11-02

My initial plans to work on PT_GETNUMLWPS and PT_GETLWPLIST were interrupted
by Paul Goyette. He notified me that he plans to push his modularization of
the ptrace(2) subsystem in the kernel and that it may take a day or so. I
said I would switch to writing ATF tests for the existing calls, to check that
ptrace(2) still functions as before.

Apparently the ptrace(2) code was committed earlier than planned, as Paul had
fixed the issues in the code dependencies. I think there was something wrong
in the relation between procfs, ktrace(2) and ptrace(2).

I took my hard copy of the 4.4BSD Design and Implementation book to check its
notes on ptrace(2). I found there the interesting information that procfs was
initially imported into BSD to address design defects (related to performance
bottlenecks). It was noted that future BSD implementations addressed the
shortcomings.

It's worth noting that ptrace(2), as an interface for retrieving the state of
the BSD kernel, tends not to be immune to race conditions. Usually we cannot
get an up-to-date snapshot of the system state. This was one of the official
reasons for dropping procfs in OpenBSD, as there were races in accessing data.

The racy nature of the mentioned interfaces favors the FreeBSD solution for
retrieving the list of kernel threads associated with the process. However, my
own argument for it is that it is O(1) vs O(n), where n is the number of LWPs.

Studying the documentation, I noticed that a FreeBSD-like PT_RESUME is already
available in NetBSD as PT_CONTINUE with a negative value passed in the data
parameter! So the work on suspending and resuming single threads is already
half done, but the suspension part is still needed.

Today I got to writing tests. I added a new test called t_ptrace with two cases:
 - traceme1,
 - traceme2.

Basically they test the PT_TRACE_ME case followed by PT_CONTINUE, with the
option to resume execution where the child process left off (addr equal to
(void *)1). PT_CONTINUE can carry, in the data argument, a signal to be
delivered to the traced process. traceme1 sends no signal, whereas traceme2
passes SIGINT there.

Why did I choose these cases for the first tests? I wanted to add a simple
placeholder and keep adding new test cases without the need to synchronize
distrib/sets/lists later, altering just a single file independently of the
other distribution pieces.

I was also inspired by the existing FreeBSD test cases. As there is no point in
reinventing test cases conceptually from scratch, I looked at the following
files:

https://github.com/freebsd/freebsd/blob/master/tests/sys/kern/ptrace_test.c
https://github.com/freebsd/freebsd/blob/master/tools/test/ptrace/scescx.c

scescx tests more advanced cases, so I started with porting ptrace_test.c
as is to the NetBSD interfaces.

In the end I blacklisted... 19 out of 23 tests due to incompatibility, and only
1 of the remaining 4 worked out of the box:

$ atf-run  ./t_ptrace|atf-report
Tests root: /public/ptrace

./t_ptrace (1/1): 4 test cases
    ptrace__parent_sees_exit_after_child_debugger: sorry, pid 16686 was killed:
orphaned traced process
[0.002644s] Failed: t_ptrace.c:281: write(cpipe[0], &c, sizeof(c)) == sizeof(c)
not met
    ptrace__parent_sees_exit_after_unrelated_debugger: [0.001380s] Failed:
t_ptrace.c:392: read(dpipe[0], &c, sizeof(c)) == sizeof(c) not met
    ptrace__parent_wait_after_attach: sorry, pid 367 was killed: orphaned traced
process
[0.001097s] Failed: t_ptrace.c:97: WSTOPSIG(status) == SIGSTOP not met
    ptrace__parent_wait_after_trace_me: [0.001177s] Passed.
[0.006527s]

Failed test cases:
    ./t_ptrace:ptrace__parent_sees_exit_after_child_debugger,
./t_ptrace:ptrace__parent_sees_exit_after_unrelated_debugger,
    ./t_ptrace:ptrace__parent_wait_after_attach

Summary for 1 test programs:
    1 passed test cases.
    3 failed test cases.
    0 expected failed test cases.
    0 skipped test cases.

I took the ptrace__parent_wait_after_trace_me one and reimplemented it for
NetBSD's ATF. In the LLDB world it will be much harder, as the machinery works
almost in an all-or-nothing fashion, so I should pick up the FreeBSD-specific
flows one after another and get an idea of how to reimplement each for NetBSD.

The flow of this test is as follows:
................................................................................
     parent process                                child process

     fork(2)+
            |
            +---------------------------------->  ptrace(PT_TRACE_ME, ...) +
                                                                           |
                                                 +raise(SIGSTOP)  <--------+
                                                 |
 +--+waitpid(child, &status, ...) <--------------+
 |
 +-->ptrace(PT_CONTINUE, child, (void*)1, 0)+-----------+
                                                        v
                                                  _exit(exitval)
                                                      +
 +--+waitpid(child, &status, ...) <-------------------+
 |
 +-->waitpid(child, &status, ...)
................................................................................

The steps are as follows:
 1. parent fork(2)s and sleeps in waitpid(2), waiting for the child's status
 2. child calls ptrace(PT_TRACE_ME, ...) to get traced by its parent
 3. child raises SIGSTOP -- all signals are intercepted by the parent
 4. parent intercepts the event and sees that the child is stopped
 5. parent resumes its child with PT_CONTINUE and resets the signal (data=0)
 6. child _exit(2)s -- it cannot use exit(3), as it would clean up all resources
 7. parent intercepts the event that the child exited
 8. parent double-checks that there is no longer a child
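
The steps above can be sketched as a standalone helper, outside of ATF. This
is only an illustration under my own assumptions: traceme_flow() is a
hypothetical name, not code from t_ptrace, and error handling is reduced to
returning -1.

```c
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: run the flow once. Returns the child's exit code
 * as seen by the parent, or -1 if any step behaved differently from the
 * description above.
 */
int
traceme_flow(int exitval)
{
	int status, code;
	pid_t child;

	/* Step 1: fork; the parent will sleep in waitpid(2) below. */
	child = fork();
	if (child == -1)
		return -1;

	if (child == 0) {
		/* Steps 2-3: ask to be traced, then stop ourselves. */
		if (ptrace(PT_TRACE_ME, 0, NULL, 0) == -1)
			_exit(127);
		raise(SIGSTOP);
		/* Step 6: leave without the exit(3) cleanup. */
		_exit(exitval);
	}

	/* Step 4: the child must be reported as stopped by SIGSTOP. */
	if (waitpid(child, &status, 0) != child)
		return -1;
	if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
		return -1;

	/* Step 5: resume where the child left off, passing no signal. */
	if (ptrace(PT_CONTINUE, child, (void *)1, 0) == -1)
		return -1;

	/* Step 7: the child must be reported as exited. */
	if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
		return -1;
	code = WEXITSTATUS(status);

	/* Step 8: no child is left, so waiting again must fail. */
	if (waitpid(child, &status, 0) != -1 || errno != ECHILD)
		return -1;

	return code;
}
```

Calling traceme_flow(5) from a test body should give back 5 when the kernel
follows the flow from the diagram.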

For future reference, here is the list of temporarily disabled FreeBSD
tests. It should give an idea of what has to be done:
 - ptrace__follow_fork_both_attached
 - ptrace__follow_fork_child_detached
 - ptrace__follow_fork_parent_detached
 - ptrace__follow_fork_both_attached_unrelated_debugger
 - ptrace__follow_fork_child_detached_unrelated_debugger
 - ptrace__follow_fork_parent_detached_unrelated_debugger
 - ptrace__getppid
 - ptrace__new_child_pl_syscall_code_fork
 - ptrace__new_child_pl_syscall_code_vfork
 - ptrace__new_child_pl_syscall_code_thread
 - ptrace__lwp_events
 - ptrace__lwp_events_exec
 - ptrace__siginfo
 - ptrace__ptrace_exec_disable
 - ptrace__ptrace_exec_enable
 - ptrace__event_mask
 - ptrace__ptrace_vfork
 - ptrace__ptrace_vfork_follow

So far my tests are verbose and code should be cleaned up. There are also two
XXX cases:

XXX: Is it safe to call ATF functions from a child? FreeBSD seems to
     construct dedicated asserts for them.

XXX: printf(3) calls from a child are not intercepted by atf-run(1)

I will try to contact Julio Merino in order to address them cleanly.

There are still almost 30 tests to be reimplemented for NetBSD. In the
initial plan I have time for the ptrace(2) part until the end of this month, so
this time must be used properly. Synchronizing the interfaces in the order of
passing FreeBSD-like tests makes more sense than blindly copying their APIs.
With the tests it's possible to unveil potential bugs and differences in
NetBSD. For example, it's good that I have not started the PT_RESUME work, as
it should already work through PT_CONTINUE.

Commits:
Add new test-case "traceme2" in t_ptrace
http://mail-index.netbsd.org/source-changes/2016/11/02/msg078827.html

Add new test t_ptrace with traceme1
http://mail-index.netbsd.org/source-changes/2016/11/02/msg078822.html

As a note for later, I would consider symlinking wait.2 with:
 - WIFEXITED.2,
 - WIFSIGNALED.2,
 - WIFSTOPPED.2,
 - WIFCONTINUED.2,
 - etc.

These macros are like function calls and it would be easier to find proper
man(1) documentation without help of apropos(1) (or other similar tools). These
are already links in OpenBSD.

In the same man-page there is also a construct with the following text:

     The wait6() function is the most general function in this family and its
     distinct features are:

     All of the desired process statuses to be waited on must be explicitly
     specified in options.  The wait(), waitpid(), wait3(), and wait4()
     functions all implicitly wait for exited and trapped processes, but the
     waitid() and wait6() functions require the corresponding WEXITED and
     WTRAPPED flags to be explicitly specified.  This allows waiting for
     processes which have experienced other status changes without having to
     also handle the exit status from terminated processes.

     The wait6() function accepts a wrusage argument which points to a
     structure defined as:

     struct wrusage {
             struct rusage   wru_self;
             struct rusage   wru_children;
     };

It's unclear whether the part after the first line belongs to it or whether
some text is missing. For now wait(2) isn't critical and can be improved (or
not) later.

2016-11-03

Today I added traceme3 and traceme4 in t_ptrace.

The traceme3 test verifies the handling of SIGINT sent by a parent to its child
through a ptrace(2) call with PT_CONTINUE. SIGINT has been chosen due to its
default property of not generating a useless core.
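
A rough sketch of passing a signal through PT_CONTINUE follows. It assumes
the child leaves the signal at its default action, in which case I would
expect the child to be terminated by it; whether the real test checks exactly
this is my assumption, and continue_with_signal() is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: stop a traced child, resume it with `sig` passed
 * in the data argument of PT_CONTINUE, and return the signal that
 * terminated the child (or -1 if anything else happened).
 */
int
continue_with_signal(int sig)
{
	int status;
	pid_t child;

	child = fork();
	if (child == -1)
		return -1;

	if (child == 0) {
		/* Make sure the default action applies to the signal. */
		signal(sig, SIG_DFL);
		if (ptrace(PT_TRACE_ME, 0, NULL, 0) == -1)
			_exit(127);
		raise(SIGSTOP);
		/* Reached only if the passed signal did not kill us. */
		_exit(0);
	}

	/* Wait for the child to stop itself with SIGSTOP. */
	if (waitpid(child, &status, 0) != child || !WIFSTOPPED(status))
		return -1;

	/* Resume the child and hand it the signal at the same time. */
	if (ptrace(PT_CONTINUE, child, (void *)1, sig) == -1)
		return -1;

	if (waitpid(child, &status, 0) != child || !WIFSIGNALED(status))
		return -1;
	return WTERMSIG(status);
}
```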

The traceme4 test case raises SIGCONT in the child and it actually triggered
two errors! One of them was that the status should be set to STOPPED, not
CONTINUED, and the other that everything detected with WIFCONTINUED() also
returns true for WIFSTOPPED(), because the logic for the bits is as follows
(sys/wait.h):

#if !( defined(_XOPEN_SOURCE) || defined(_NETBSD_SOURCE) ) || defined(_KERNEL)
#define _W_INT(i)       (i)
#else
#define _W_INT(w)       (*(int *)(void *)&(w))  /* convert union wait to int */
#endif

#define _WSTATUS(x)     (_W_INT(x) & 0177)
#define _WSTOPPED       0177            /* _WSTATUS if process is stopped */
#define _WCONTINUED     0xffffU
#define WIFSTOPPED(x)   (_WSTATUS(x) == _WSTOPPED)
#define WIFCONTINUED(x) (_W_INT(x) == _WCONTINUED)
#define WSTOPSIG(x)     ((int)(((unsigned int)_W_INT(x)) >> 8) & 0xff)
#define WIFSIGNALED(x)  (_WSTATUS(x) != _WSTOPPED && _WSTATUS(x) != 0)
#define WTERMSIG(x)     (_WSTATUS(x))
#define WIFEXITED(x)    (_WSTATUS(x) == 0)
#define WEXITSTATUS(x)  ((int)(((unsigned int)_W_INT(x)) >> 8) & 0xff)

It's apparent that WIFSTOPPED() checks a bit submask of WIFCONTINUED().
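
The overlap can be reproduced with plain arithmetic. The sketch below copies
the quoted definitions under an OLD_ prefix (my naming, to avoid clashing
with the system header) and takes the _W_INT(i) == (i) branch; the only
change is an explicit cast in the comparison with 0xffffU.

```c
/* Local copies of the quoted macros, renamed with an OLD_ prefix. */
#define OLD_WSTATUS(x)		((x) & 0177)
#define OLD_WSTOPPED		0177
#define OLD_WCONTINUED		0xffffU
#define OLD_WIFSTOPPED(x)	(OLD_WSTATUS(x) == OLD_WSTOPPED)
#define OLD_WIFCONTINUED(x)	((unsigned int)(x) == OLD_WCONTINUED)

/*
 * Hypothetical helper: returns 1 when both predicates accept the same
 * status value, 0 otherwise.
 */
int
old_macros_overlap(int status)
{
	return OLD_WIFSTOPPED(status) && OLD_WIFCONTINUED(status);
}
```

The low seven bits of 0xffff are 0177, so old_macros_overlap(0xffff) is 1,
while an ordinary stop status such as (17 << 8) | 0177 matches WIFSTOPPED()
only.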

The waitpid(2) part triggering WIFCONTINUED inside the kernel has been fixed
by Christos Zoulas:

http://mail-index.netbsd.org/source-changes/2016/11/03/msg078852.html

This issue got its dedicated PR: kern/51596:

http://gnats.netbsd.org/51596

Once it has been tested by the releng machines and verified to work, I will
close it and remove the corresponding atf_tc_expect_fail() entry.

Originally I was trying to emit a SIGCONT signal with ptrace(2), however I
failed to get sane results and this is why I switched to raising it from the
child. I finally got WIFCONTINUED true... and I learnt later that it's a bug.
Having both WIFCONTINUED() and WIFSTOPPED() true was suspicious.

My mail to tech-userlevel@ has finally been passed to the mailing list and it
got replies from Joerg Sonnenberger and Christos Zoulas:

http://mail-index.netbsd.org/tech-userlevel/2016/11/03/msg010345.html

Joerg proposed:

struct ptrace_getinfos_request {
  size_t allocated_lwps;
  size_t current_lwps;
  struct ptrace_lwpinfo[];
}

Christos noted that it might not be needed, as there might already be code
for it in libpthread_dbg (td_open() etc.).

I will postpone it until I have translated more tests from FreeBSD.
To save time, I will temporarily ignore WIFCONTINUED() and leave it for later.

Also I got feedback (before sending a mail to Julio Merino) from Martin
Husemann about the issues I pointed out in ATF. In general printf(3) won't work
from the child, as it would need to establish communication with its parent and
output text through it. It's not worth the effort and it would overcomplicate
the code unnecessarily, so the idea of printing messages from both the parent
and its child has been abandoned. Also, due to potential issues with calling
ATF functions in a child, the consensus is to add a new wrapper around the
err(3)/errx(3) calls.

Here is the code for the above:

/*
 * A child process cannot call atf functions and expect them to magically
 * work like in the parent.
 * The printf(3) messaging from a child will not work out of the box as well
 * without establishing a communication protocol with its parent. To not
 * overcomplicate the tests - do not log from a child and use err(3)/errx(3)
 * wrapped with FORKEE_ASSERT()/FORKEE_ASSERTX() as that is guaranteed to work.
 */
#define FORKEE_ASSERTX(x)                                                     \
do {                                                                          \
        int ret = (x);                                                        \
        if (!ret)                                                             \
                errx(EXIT_FAILURE, "%s:%d %s(): Assertion failed for: %s",    \
                     __FILE__, __LINE__, __func__, #x);                       \
} while (0)
#define FORKEE_ASSERT(x)                                                      \
do {                                                                          \
        int ret = (x);                                                        \
        if (!ret)                                                             \
                err(EXIT_FAILURE, "%s:%d %s(): Assertion failed for: %s",     \
                     __FILE__, __LINE__, __func__, #x);                       \
} while (0)

With the above we are removing some extra messages, like:

- if (traceme2_catched != 1)
-   errx(EXIT_FAILURE, "2: signal handler not called? "
-       "traceme2_catched (equal to %d) != 1",
-       traceme2_catched);
+ FORKEE_ASSERTX(traceme2_caught == 1);

NB. The above code also fixes the spelling of the past tense and past
participle of catch. For penance I fixed all occurrences of "catched" in the
src/ code.

The messages will now be less verbose and a developer might need to run the
test under a debugger to get more information... on the other hand, this test
will remain there for the sake of catching regressions.
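
To illustrate how the parent can observe a failed FORKEE_ASSERTX(), here is a
small sketch. The macro is copied from above; run_forkee() and the exit code
5 are hypothetical choices of mine, not part of the committed tests.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <err.h>
#include <stdlib.h>
#include <unistd.h>

#define FORKEE_ASSERTX(x)                                                     \
do {                                                                          \
        int ret = (x);                                                        \
        if (!ret)                                                             \
                errx(EXIT_FAILURE, "%s:%d %s(): Assertion failed for: %s",    \
                     __FILE__, __LINE__, __func__, #x);                       \
} while (0)

/*
 * Hypothetical helper: fork a child that asserts `value == 1` and then
 * exits with code 5. Returns the exit code seen by the parent, or -1.
 */
int
run_forkee(int value)
{
	int status;
	pid_t child;

	child = fork();
	if (child == -1)
		return -1;

	if (child == 0) {
		/* On failure errx(3) prints to stderr and exits for us. */
		FORKEE_ASSERTX(value == 1);
		_exit(5);
	}

	/* The parent only needs the exit status to detect the failure. */
	if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With this layout run_forkee(1) reports 5, and run_forkee(0) reports
EXIT_FAILURE because the child bailed out through errx(3).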

I also finally restructured the code for test bodies, from:

        if (child == 0) {
            // child's code here
        } else {
            // parent's code here
        }

to:

        if (child == 0) {
            // child's code here
        }
        // parent's code here

The child is destined to terminate/exit anyway and never runs past the closing
bracket. I chose the initial form deliberately, hoping that it would not be
less readable, however commands and especially verbose messages tend to be
long and the need to wrap them convinced me to restructure the code.

Robert Elz asked me to replace sys_errlist(3) with strerror(3) and explained
that with a custom or newer kernel we can receive an unrecognized errno(3)
value that is outside the sys_errlist[] array. strerror(3) already handles
this use-case and is safer.
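
A minimal sketch of the safer pattern: the message is always obtained through
strerror(3), so an errno value unknown to libc never turns into an
out-of-bounds array access. format_errno() is a hypothetical helper name, and
the exact text returned for unknown values is left to the C library.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "what: <message>" into buf for any errno
 * value. Returns the snprintf(3) result.
 */
int
format_errno(char *buf, size_t len, const char *what, int error)
{
	/* No indexing into sys_errlist[]; strerror(3) checks the range. */
	return snprintf(buf, len, "%s: %s", what, strerror(error));
}
```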

Commits from today:
http://mail-index.netbsd.org/source-changes/2016/11/03/msg078837.html
http://mail-index.netbsd.org/source-changes/2016/11/03/msg078847.html
http://mail-index.netbsd.org/source-changes/2016/11/03/msg078849.html
http://mail-index.netbsd.org/source-changes/2016/11/03/msg078850.html
http://mail-index.netbsd.org/source-changes/2016/11/03/msg078851.html
http://mail-index.netbsd.org/source-changes/2016/11/03/msg078853.html

I feel like I need to speed up my efforts with porting FreeBSD tests for
ptrace(2).

Short summary after the third day:
 - documented missing calls in ptrace(2)
 - added 4 tests in t_ptrace
 - detected two NetBSD issues
 - removed need for PT_RESUME in ptrace(2)
 - perhaps removed need for retrieving a list of LWPs
 - adapted 1 FreeBSD test for NetBSD

TODO:
 - fix WIFCONTINUED() issue
 - port remaining FreeBSD tests
 - investigate & implement if needed PT_SUSPEND
 - investigate retrieving of LWPs for ptrace(2)

TODO of things that might not be done:
 - add support for debug registers in CPU

2016-11-04

It was a long day. The traceme4 expected ATF failure has been removed from the
test; the code now functions appropriately and reports success on the test
servers.

Today I took the ptrace__parent_sees_exit_after_child_debugger test and
tried to adapt it to NetBSD. I found that its IPC with pipe(2) was totally
wrong and I wonder how it could work at all. Right now I don't have access to
FreeBSD and I was unable to test it there. First I need to finish the
VirtualBox port in order to get useful virtualization on NetBSD... but that's a
topic for another month.

What was wrong in the FreeBSD tests? The pipe(2) call creates two file
descriptors: input (index 0) and output (index 1). All read operations must be
done through [0] and write ones through the [1] index. The original code was
freely reading and writing on any of the descriptors available... this
resulted in broken code on NetBSD and Linux. Additionally, once I finally
sorted things out and got it to work on Linux, I experienced... a kernel panic
on NetBSD. It was caused by the reparenting of a process between a tracer and a
parent. Thankfully, I got a lot of help from Christos Zoulas - I shared my code
that triggers the panic and he fixed the kernel with the following commit:

http://mail-index.netbsd.org/source-changes/2016/11/04/msg078868.html
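
The pipe(2) direction rule can be checked with a short sketch. Both function
names are hypothetical. On systems where pipes are unidirectional I would
expect the write to the [0] end to fail with EBADF; that expectation is an
assumption about the platform, not something taken from the FreeBSD tests.

```c
#include <sys/types.h>

#include <errno.h>
#include <unistd.h>

/* Returns 1 if a byte written to fds[1] is read back from fds[0]. */
int
pipe_roundtrip(void)
{
	int fds[2], ok;
	char in = 'x', out = 0;

	if (pipe(fds) == -1)
		return 0;
	ok = write(fds[1], &in, sizeof(in)) == (ssize_t)sizeof(in) &&
	    read(fds[0], &out, sizeof(out)) == (ssize_t)sizeof(out) &&
	    out == in;
	close(fds[0]);
	close(fds[1]);
	return ok;
}

/*
 * Returns the errno from writing to the read end of a fresh pipe,
 * 0 if the write unexpectedly worked, or -1 if pipe(2) itself failed.
 */
int
write_to_read_end(void)
{
	int fds[2], error;
	char c = 'x';

	if (pipe(fds) == -1)
		return -1;
	error = (write(fds[0], &c, sizeof(c)) == -1) ? errno : 0;
	close(fds[0]);
	close(fds[1]);
	return error;
}
```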

I was struggling many hours with the IPC issues and finally narrowed the things
down to the conclusion that NetBSD does not guarantee tracer to see termination
of a tracee before the parent of the tracee. I filed a PR for it: kern/51600.

I'm also discussing off-list the proper WIFCONTINUED() solution. I promised
to add tests for this macro once the IPC & ptrace(2) issues were addressed.
That's now done, so tomorrow I will produce tests for WIFCONTINUED() and they
will help to fix this interface properly.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/04/msg078859.html
http://mail-index.netbsd.org/source-changes/2016/11/05/msg078888.html

2016-11-05

The WIFCONTINUED() issue has been fixed. It got PR standards/51603.
Christos Zoulas committed the fix changing macros in <sys/wait.h> to new
conditions:

@@ -58,10 +58,10 @@
 #define  _WSTATUS(x)	(_W_INT(x) & 0177)
 #define  _WSTOPPED	0177	     /* _WSTATUS if process is stopped */
 #define _WCONTINUED	0xffffU
-#define WIFSTOPPED(x)	(_WSTATUS(x) == _WSTOPPED)
+#define WIFSTOPPED(x)	(_WSTATUS(x) == _WSTOPPED && !WIFCONTINUED(x))
 #define WIFCONTINUED(x)	     (_W_INT(x) == _WCONTINUED)
 #define WSTOPSIG(x)		     ((int)(((unsigned int)_W_INT(x)) >> 8) & 0xff)
-#define WIFSIGNALED(x)		     (_WSTATUS(x) != _WSTOPPED && _WSTATUS(x) != 0)
+#define WIFSIGNALED(x)		     (!WIFSTOPPED(x) && !WIFCONTINUED(x) && !WIFEXITED(x))
 #define WTERMSIG(x)		     (_WSTATUS(x))
 #define WIFEXITED(x)		     (_WSTATUS(x) == 0)
 #define WEXITSTATUS(x)		     ((int)(((unsigned int)_W_INT(x)) >> 8) & 0xff)

http://mail-index.netbsd.org/source-changes/2016/11/05/msg078903.html
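
The effect of the new conditions can be checked with plain arithmetic. The
sketch below copies the patched definitions under a NEW_ prefix (my naming)
and counts how many of the four predicates accept a given status; the helper
names are hypothetical.

```c
/* Local copies of the patched macros, renamed with a NEW_ prefix. */
#define NEW_WSTATUS(x)		((x) & 0177)
#define NEW_WSTOPPED		0177
#define NEW_WCONTINUED		0xffffU
#define NEW_WIFCONTINUED(x)	((unsigned int)(x) == NEW_WCONTINUED)
#define NEW_WIFSTOPPED(x)	\
	(NEW_WSTATUS(x) == NEW_WSTOPPED && !NEW_WIFCONTINUED(x))
#define NEW_WIFEXITED(x)	(NEW_WSTATUS(x) == 0)
#define NEW_WIFSIGNALED(x)	\
	(!NEW_WIFSTOPPED(x) && !NEW_WIFCONTINUED(x) && !NEW_WIFEXITED(x))

/* Number of predicates that accept the status. */
int
new_macros_matches(int status)
{
	return NEW_WIFSTOPPED(status) + NEW_WIFCONTINUED(status) +
	    NEW_WIFEXITED(status) + NEW_WIFSIGNALED(status);
}

/* Returns 1 if every 16-bit status matches exactly one predicate. */
int
all_statuses_unambiguous(void)
{
	int status;

	for (status = 0; status <= 0xffff; status++)
		if (new_macros_matches(status) != 1)
			return 0;
	return 1;
}
```

With these definitions 0xffff is classified as continued only, and each
16-bit status value falls into exactly one category.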

The issue of kern/51600 has been fixed as well by Christos Zoulas.
http://mail-index.netbsd.org/source-changes/2016/11/05/msg078889.html

This fixed all issues in t_ptrace.

I was trying to find interesting tests for WIFCONTINUED(), however everything I
came up with was already in our lib/libc/sys/t_wait. I just added there a new
test inspired by regress/sys/kern/sig-stop/sig-stop.c from OpenBSD; this
program sends SIGSTOP and SIGCONT to a child multiple times in a loop.

The short summary is that there are no longer any known issues. It was a day of
cleanups.

Plan for tomorrow is to resume porting FreeBSD tests, starting with:
 - ptrace__parent_sees_exit_after_child_debugger,
 - ptrace__parent_sees_exit_after_unrelated_debugger,
 - ptrace__parent_wait_after_attach

These ones are closely related to fixed bugs and it's time to validate whether
there are still any problems.

My impression is that there are way more bugs than I expected... and all of
them were triggered with so few tests. Thanks to the help from Christos
Zoulas, it is still possible to ship the project on time. Porting the code,
triaging the problems and then debugging and fixing the kernel & headers alone
would not be doable in the given time frame, assuming that there are many
more issues to face. I'm still far from implementing any new features.

Another strong impression is that porting tests from FreeBSD takes far more
than a few #ifdefs for silly things like the nitems() macro.

Commits:
http://mail-index.netbsd.org/source-changes/2016/11/05/msg078888.html
http://mail-index.netbsd.org/source-changes/2016/11/05/msg078891.html
http://mail-index.netbsd.org/source-changes/2016/11/05/msg078899.html
http://mail-index.netbsd.org/source-changes/2016/11/06/msg078905.html

2016-11-06

It was a long day. Work on the project really started with a report by Nicolas
Joly that wait(2)-like functions called with no children and the WNOHANG option
do not return an error.

It's apparently a regression introduced by the fix that lets a debugger see
process termination before the child's parent.

This issue reorganized my plans for today and in the end I was working on
adding new tests to verify what happens to all functions from the wait(2)
family (wait, waitpid, waitid, wait3, wait4, wait6) when a process does not
have children (t_wait_noproc). Another test (t_wait_noproc_wnohang) does the
same, but with the WNOHANG option specified. These tests obsoleted a use-case
from t_wait called t_wait_noproc, which tested just the wait6(2) interface and
without WNOHANG.
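
The expected behaviour can be sketched for waitpid(2) alone. This is a
simplified illustration, not the committed test; wait_noproc_errno() is a
hypothetical name and the caller is assumed to have no children at all.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <errno.h>

/*
 * Hypothetical helper: wait for "any child" with the given options from
 * a process without children. Returns 0 if the call did not fail,
 * otherwise the errno it set.
 */
int
wait_noproc_errno(int options)
{
	int status;

	errno = 0;
	if (waitpid(-1, &status, options) != -1)
		return 0;
	return errno;
}
```

In a process without children both wait_noproc_errno(WNOHANG) and
wait_noproc_errno(0) should report ECHILD; the regression described above
made the WNOHANG variant return without an error.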

These issues convinced me to add a matrix of wait(2) tests in t_ptrace. I
decided to add new tests there which don't make use of wait(2), extract the
others to t_ptrace_wait, and add sibling files for the other wait functions,
reusing the same code base for waitpid(2), waitid(2) etc. - similar to
t_mutex_lock and its sibling t_mutex_timedlock. Namely, t_ptrace_wait now
starts in the header with the following code:

[...]
/* Detect plain wait(2) use-case */
#if !defined(TPTRACE_WAITPID) && \
    !defined(TPTRACE_WAITID) && \
    !defined(TPTRACE_WAIT3) && \
    !defined(TPTRACE_WAIT4) && \
    !defined(TPTRACE_WAIT6)
#define TPTRACE_WAIT
#endif

[...]

#if defined(TPTRACE_WAIT)
#       define TPTRACE_FNAME                    "wait"
#       define TPTRACE_WAIT4TYPE(a,b,c,d)       wait((b))
#       define TPTRACE_WAITGENERIC(a,b,c)       wait((b))
#elif defined(TPTRACE_WAITPID)
#       define TPTRACE_FNAME                    "waitpid"
#       define TPTRACE_WAIT4TYPE(a,b,c,d)       waitpid((a),(b),(c))
#       define TPTRACE_WAITGENERIC(a,b,c)       waitpid((a),(b),(c))
#       define TPTRACE_WAITGENERIC_PID(a,b,c)   TPTRACE_WAITGENERIC
#elif defined(TPTRACE_WAITID)
#       define TPTRACE_FNAME                    "waitid"
#       define TPTRACE_WAIT6TYPE(a,b,c,d,e,f)   waitid((a),(b),(f),(d))
#elif defined(TPTRACE_WAIT3)
#       define TPTRACE_FNAME                    "wait3"
#       define TPTRACE_WAIT4TYPE(a,b,c,d)       wait3((b),(c),(d))
#       define TPTRACE_WAITGENERIC(a,b,c)       wait3((b),(c), NULL)
#elif defined(TPTRACE_WAIT4)
#       define TPTRACE_FNAME                    "wait4"
#       define TPTRACE_WAIT4TYPE(a,b,c,d)       wait4((a), (b),(c),(d))
#       define TPTRACE_WAITGENERIC(a,b,c)       wait4((a),(b),(c), NULL)
#       define TPTRACE_WAITGENERIC_PID(a,b,c)   TPTRACE_WAITGENERIC
#elif defined(TPTRACE_WAIT6)
#       define TPTRACE_FNAME                    "wait6"
#       define TPTRACE_WAIT6TYPE(a,b,c,d,e,f)   wait6((a),(b),(c),(d),(e),(f))
#       define TPTRACE_WAITGENERIC(a,b,c)       \
                wait6(P_PID,(a),(b),(c)|WEXITED|WTRAPPED,NULL,NULL)
#       define TPTRACE_WAITGENERIC_PID(a,b,c)   TPTRACE_WAITGENERIC
#endif

[...]

And the other files like t_ptrace_wait3.c are limited to two lines of code
like:

#define TPTRACE_WAIT3
#include "t_ptrace_wait.c"
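
The selection technique can be tried in isolation with a reduced sketch. The
DEMO_ names are hypothetical and only two members of the family are covered;
the define at the top plays the role of the two-line sibling file.

```c
#include <sys/types.h>
#include <sys/wait.h>

#include <unistd.h>

/* A sibling file would define this before including the shared source. */
#define DEMO_WAITPID

/* Detect the plain wait(2) use-case when nothing else was requested. */
#if !defined(DEMO_WAITPID)
#define DEMO_WAIT
#endif

#if defined(DEMO_WAIT)
#	define DEMO_FNAME			"wait"
#	define DEMO_WAITGENERIC(a,b,c)		wait((b))
#elif defined(DEMO_WAITPID)
#	define DEMO_FNAME			"waitpid"
#	define DEMO_WAITGENERIC(a,b,c)		waitpid((a),(b),(c))
#endif

/* Name of the wait(2) variant this build was compiled for. */
const char *
demo_fname(void)
{
	return DEMO_FNAME;
}

/*
 * Shared test body: fork a child that exits with exitval and collect it
 * through whichever function DEMO_WAITGENERIC expands to.
 */
int
demo_wait_exitcode(int exitval)
{
	int status;
	pid_t child;

	child = fork();
	if (child == -1)
		return -1;
	if (child == 0)
		_exit(exitval);

	if (DEMO_WAITGENERIC(child, &status, 0) != child ||
	    !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

The body of demo_wait_exitcode() never names a concrete wait function, which
is what allows the same source to be compiled once per interface.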

Unfortunately, I failed to finish a case for the waitid(2) tests tonight and
I'm rescheduling it for tomorrow. Also, I don't want to commit larger new code
late at night. It can always break something and disrupt the releng machines.

I was thinking for a while whether there are any possible tests for ptrace(2)
without wait(2) functions... finally I got the idea that I can test failures
of PT_ATTACH (and maybe PT_TRACE_ME). I added two tests: attach_pid0 and
attach_pid1. Nobody can attach to process ID 0 (it's a kernel thread) and a
standard user cannot attach to PID 1 (/sbin/init). Initially, I disabled the
attach_pid1 test for the root user by marking it to be skipped by ATF, and
Nicolas Joly informed me that it is possible to mark it as unprivileged-only
with the following code:

atf_tc_set_md_var(tc, "require.user", "unprivileged");
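
Outside of ATF, the attach_pid0 idea can be sketched as below. attach_errno()
is a hypothetical helper; the particular errno value is deliberately not
assumed, only that the attach must fail.

```c
#include <sys/types.h>
#include <sys/ptrace.h>

#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: try to attach to `pid`. Returns 0 if the attach
 * unexpectedly succeeded, otherwise the errno reported by ptrace(2).
 */
int
attach_errno(pid_t pid)
{
	errno = 0;
	if (ptrace(PT_ATTACH, pid, NULL, 0) != -1) {
		/* Do not leave a stray tracee behind. */
		(void)ptrace(PT_DETACH, pid, (void *)1, 0);
		return 0;
	}
	return errno;
}
```

For process ID 0 the helper should always come back with a non-zero errno.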

At the end of the day, I committed a correction to t_wait_noproc(_wnohang) and
t_ptrace.

Robert Elz suggested further extensions to t_wait; in the end I have not
started on them today.

So yeah, I want to get out of the wait(2) & ptrace(2) trap as soon as possible
and quickly sort out all the bugs with as large a coverage as doable, in order
to move on to new ptrace(2) scenarios. It's slowly turning into a never-ending
story... but it must be done before going forward, otherwise if it bit me
from the LLDB code it would be much harder to debug (multiple threads, C++,
enormous code base, no initial NetBSD support...).

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/06/msg078911.html
http://mail-index.netbsd.org/source-changes/2016/11/06/msg078912.html
http://mail-index.netbsd.org/source-changes/2016/11/06/msg078913.html
http://mail-index.netbsd.org/source-changes/2016/11/06/msg078917.html
http://mail-index.netbsd.org/source-changes/2016/11/07/msg078924.html

2016-11-07

Today I feel that it was good to hold off on committing new code to the ATF
framework. I took the code for t_ptrace_wait, refactored it, cleaned it up and
finally committed it.

I tried to make the commit message verbose:


Add new tests for combination of wait(2) interfaces with ptrace(2)

Move out wait(2) specific tests from t_ptrace and put them to t_ptrace_wait

Add generic code fragments to reuse the same source-code for every member
of the wait(2) family, namely:
 - wait(2)
 - waitpid(2)
 - waitid(2)
 - wait3(2)
 - wait4(2)
 - wait6(2)

Currently in the new test-suite there are the following tests:
 - traceme1
 - traceme2
 - traceme3
 - traceme4
 - attach1

Not all tests are possible to be executed against every wait(2)-like
interface, therefore they will be disabled in such case. Currently this
limits attach1 to waitpid(2), waitid(2), wait4(2), wait6(2), while the
other tests (traceme 1-4) run with all of the interfaces.

The construct of this file is dedicated for addition of new tests in the
close future.

As of now all of the tests pass correctly.

Thanks to Robert Elz for suggestions on improving the code (earlier draft
of this new form).

Sponsored by <The NetBSD Foundation>.


If I am not mistaken, the number of ATF tests for wait(2)/ptrace(2) is now
larger by.. 23 new ones.

The plan for tomorrow is to add the tests requested by Robert Elz, and those
I planned myself, to t_wait and t_ptrace, and finally to move on to new tests
modeled after the FreeBSD test-suite.

I asked for 2 days to finish this before deciding how to fix the WNOHANG
regression in wait(2)-like functions.

Today I learnt that fork(2) in ATF should be used with a dedicated function
atf_utils_fork(). I started to use more ATF functions described in
atf-c-api(3) for the existing tests.

I've finally mailed jmmv@ asking whether it is really necessary to avoid ATF
code in the children of a process traced by an ATF test... and he confirmed
it. I really dislike it.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/07/msg078936.html

2016-11-08

It was a rather busy day of catching up on needed and still missing tests.
I just committed two tests in t_ptrace: one asserting that it's not valid to
attach (PT_ATTACH) to self, and another asserting that a debugger cannot
trace another process unless the process's root directory is at or below the
tracing process's root.

Today Robert Elz committed a patch to kernel:
http://mail-index.netbsd.org/source-changes/2016/11/09/msg078953.html

With the following text:
Modified Files:
        src/sys/kern: kern_exit.c

Log Message:
PR kern/51600 ; PR standards/51606

Revert 1.264 - that was intended to fix 51600, but didn't, it just
hid the problem, and caused 51606.  This fixes 51606.

Handle waiting on a process that has been detatched from its parent
because of being ptrace'd by some other process.  This fixes 51600.
("handle" here means that the wait() hangs, or with WNOHANG, returns 0,
we cannot actually wait on a process that is not currently an attached
child.)

Note: the detatched process waiting is not yet perfect (it fails to
take account of options like WALLSIG and WALTSIG) - suport for those
(that is, ignoring a detatched child that one of those options will
later cause to be ignored when the process is re-attached.)

For now, for ither than when waiting for a specific process ID, when
a process does a wait() sys call (any of them), has no applicable
children attached that can be returned, and has at least one detatched
child, then we do a linear search of all processes to look for a
suitable detatched child.  This is likely to be slow - but very rare.
Eventually it might be better to keep a list of detatched children
per process.


I'm working on new tests for t_wait_* to test valid combinations of the options
argument in wait(2)-like functions. I wrote the code for it, but there are
still some issues that I'm trying to squash, so I will finish it tomorrow.

I also verified that the commit by Robert fixed the issues with WNOHANG. Once
t_wait_noproc is finished, I will commit a change there that adds the new
tests and removes the old functions that mark them as broken.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/08/msg078950.html
http://mail-index.netbsd.org/source-changes/2016/11/08/msg078951.html
http://mail-index.netbsd.org/source-changes/2016/11/08/msg078952.html

2016-11-09

I finished the additional tests for wait(2), namely in the file t_wait_noproc.
Here is my commit message, which explains it all:

Add new tests in t_wait_noproc and t_wait_noproc_wnohang to test more options
types

Add new tests:
 - waitpid_options
 - waitid_options
 - wait3_options
 - wait4_options
 - wait6_options

These tests are included in t_wait_noproc and t_wait_noproc_wnohang.

waitpid_options, wait3_options, wait4_options test combinations of options
of: bit for WALLSIG, WALTSIG, __WALL, __WCLONE and later a full combination
mask of WNOWAIT, WEXITED, WUNTRACED, WSTOPPED, WTRAPPED and WCONTINUED.

waitid and wait6 test full combination mask of WNOWAIT, WEXITED, WUNTRACED,
WSTOPPED, WTRAPPED and WCONTINUED -- excluded empty value and singular
WNOWAIT.

For compatibility reasons alter waitid and wait6 to test against options
WEXITED | WTRAPPED, as it's equivalent to waitpid, wait3, wait4.

The intention for these tests is to catch any possible issues with slightly
changed behavior of wait(2)-like functions in terms of valid options
values.

All tests pass successfully.

Sponsored by <The NetBSD Foundation>
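
A "full combination mask" of a set of option bits can be generated by walking
all non-empty subsets. The sketch below is only an illustration: the X_* values
are placeholders and NOT the real <sys/wait.h> constants, and
build_option_masks() is a made-up helper, not the code from the tests.

```c
#include <stddef.h>

/* Placeholder bit values, NOT the real <sys/wait.h> constants. */
enum {
        X_WNOWAIT    = 1 << 0,
        X_WEXITED    = 1 << 1,
        X_WUNTRACED  = 1 << 2,
        X_WSTOPPED   = 1 << 3,
        X_WTRAPPED   = 1 << 4,
        X_WCONTINUED = 1 << 5
};

/*
 * Generate every non-empty OR-combination of flags[0..nflags-1], skipping
 * the single mask `excluded`. At most `cap` masks are stored in out[]; the
 * return value is the number of masks generated.
 */
static size_t
build_option_masks(const int *flags, size_t nflags, int excluded,
    int *out, size_t cap)
{
        size_t count = 0;
        unsigned long subset, limit = 1UL << nflags;

        for (subset = 1; subset < limit; subset++) {
                int mask = 0;
                size_t i;

                for (i = 0; i < nflags; i++)
                        if (subset & (1UL << i))
                                mask |= flags[i];
                if (mask == excluded)
                        continue;       /* e.g. singular WNOWAIT */
                if (count < cap)
                        out[count] = mask;
                count++;
        }
        return count;
}
```

With six distinct placeholder bits and X_WNOWAIT excluded this yields
2^6 - 1 - 1 = 62 masks. With the real constants the count can be lower if some
of the flags share a value.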

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/09/msg078963.html

2016-11-10

Today I have committed one test to t_ptrace_wait with the following message:

Add new test attach2 in t_ptrace_wait{4,6,id,pid}

This test asserts that any tracer sees process termination before its
parent.

This test is not applicable for wait(2) and wait(3) as these interfaces
cannot get specified process id argument (PID).

Sponsored by <The NetBSD Foundation>.


As noted by Robert Elz, I mistyped wait3(2) as wait(3).

Then I moved on to a new test-case. I want to attach to a child from its
parent with PT_ATTACH & PT_CONTINUE. However, I faced the following issues:
- PT_ATTACH seems to work, but waiting for the stopped status and signal from
  the child yields SIGTRAP, not SIGSTOP as on Linux and FreeBSD. This might be
  by design, I'm unsure. However, so far I was getting SIGSTOP from a tracer
  process that was not the parent.
- PT_CONTINUE seems to have no effect at all, the child hangs. This operation
  works on Linux and FreeBSD, where in the end the test passes correctly.
- Debugging this with gdb(1) results in receiving SIGABRT from the GNU
  debugger. This makes things harder in general.

Tomorrow I plan to submit a PR for it, clean up the test, commit it as is and
move on to other tests.

For now I don't want to get distracted by non-test affairs. I want to
write/port as many tests as possible and collect a list of all issues upfront,
before writing any new code - unless other developers help out.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/10/msg078978.html

2016-11-11

I have finally pushed attach3 in the t_ptrace_wait family of tests.

Lately the releng machines are plagued with filesystem issues breaking tests.
These issues are due to the work of Jaromir Dolecek on WAPBL enhancements. I
would like to track ptrace(2)-related test results on the releng machines...
but it is getting hard.

I have filed a PR report:
PT_ATTACH from a parent is unreliable
http://gnats.netbsd.org/51621

Christos pushed fixes for it... but I cannot confirm that the test now passes.
There is also a regression in attach1 and attach2.

I also worked on a test where a child attaches to its parent and traces it, as
suggested by Paul Goyette.

Tomorrow I want to commit this test together with another one requested by
Matt Green (to test root attaching to pid 1 with securelevel >= 1).

Short summary: 4 of the 23 tests in the basic FreeBSD test-suite are ported.
I think I'm at 10% of the goal for this month after 37% of the time has
passed. Debugging interfaces apparently are a can of worms; perhaps three
months would be needed to nail all the issues down.

Some of the new tests seem to be relatively simple, however I'm sure there
will be more issues. In particular, a single test can unveil three different
issues (even if the solution is a single line of code).

The good point is that many new tests will be extensions of the existing ones,
so I can speed up in terms of the number of tests ported. There are 7 tests
verifying relations between the parent and the tracer of a tracee; the others
cover other capabilities originally planned as the goal for this month.

I'm absorbed by writing tests, comparing with Linux and FreeBSD, understanding
behavior etc. - it all needs diligence.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/11/msg079006.html

2016-11-12

It was a long day. I have added 2 new tests, filed (and soon closed) 1 new PR
and closed an older PR.

I have added a test for a child doing PT_ATTACH to its parent; it worked
flawlessly. Another test verifies the parent's process id - whether it changes
after reparenting from the parent to a debugger - and this was broken, as
getppid(2) returned the debugger. It was soon addressed by Christos Zoulas.
Later Robert Elz suggested improving this test and I scheduled two new
test-cases:
 - to check the parent's pid with sysctl(7) and kinfo_proc2
 - to check the parent's pid with procfs (/proc/curproc/stat - 4th value)

The attach3 test has finally been fixed. Christos noted that there was a race:
write(2) cannot be used to block, read(2) is needed. In the end I planned a new
family of helper functions to facilitate IPC with pipe(2), namely
parent_handshake() and others. I've refactored the code of attach1 to use these
new functions, however I'm still struggling with some bug in the code. Once it
starts to function correctly, I will refactor all the other attach* tests.
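
A minimal sketch of a pipe(2)-based handshake follows, where the waiting side
always blocks in read(2) and the signalling side only writes one byte. This is
not the helper family from the tree; struct msg_fds, msg_open(), msg_signal(),
msg_await() and handshake_demo() are names invented for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper type: one pipe per direction. */
struct msg_fds {
        int to_child[2];
        int to_parent[2];
};

static int
msg_open(struct msg_fds *m)
{
        if (pipe(m->to_child) == -1)
                return -1;
        if (pipe(m->to_parent) == -1) {
                close(m->to_child[0]);
                close(m->to_child[1]);
                return -1;
        }
        return 0;
}

/* Wake up the peer: write a single byte. */
static int
msg_signal(int wfd)
{
        unsigned char byte = 0;

        return write(wfd, &byte, 1) == 1 ? 0 : -1;
}

/* Block until the peer signals: read(2) sleeps until a byte arrives. */
static int
msg_await(int rfd)
{
        unsigned char byte;
        ssize_t n;

        do {
                n = read(rfd, &byte, 1);
        } while (n == -1 && errno == EINTR);
        return n == 1 ? 0 : -1;
}

/* Parent tells the child to go, the child acknowledges. 0 on success. */
static int
handshake_demo(void)
{
        struct msg_fds m;
        pid_t child;
        int status, rv = -1;

        if (msg_open(&m) != 0)
                return -1;
        child = fork();
        if (child == -1)
                return -1;
        if (child == 0) {
                if (msg_await(m.to_child[0]) != 0)
                        _exit(1);
                if (msg_signal(m.to_parent[1]) != 0)
                        _exit(2);
                _exit(0);
        }
        if (msg_signal(m.to_child[1]) == 0 &&
            msg_await(m.to_parent[0]) == 0 &&
            waitpid(child, &status, 0) == child &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
                rv = 0;
        close(m.to_child[0]);
        close(m.to_child[1]);
        close(m.to_parent[0]);
        close(m.to_parent[1]);
        return rv;
}
```

The point of the design is that neither side relies on write(2) blocking: each
step of the protocol is ordered by a read(2) that cannot return before the
peer has written.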

I have also finished all the tests in the attach* group (modulo new ones for
getppid(2) variations). I started to investigate the FreeBSD tests for fork(2)
functions. There is a FreeBSD-specific mechanism to handle tracing new
children of a traced process; it's currently under investigation. So far I
have not seen similar functions used in LLDB - so maybe after adapting basic
fork(2) and later vfork(2) tests, I can move on to other interfaces.

Tomorrow I plan to finish refactoring t_ptrace_wait.c to use the handshake
helper functions, add a test for securelevel >= 1 and attaching to pid 1, and
add two new tests for getppid(2) functions. Perhaps not too much, but I will be
able to start the new week (Monday) with a clean slate for fork(2) verification
and without a debt of other tests.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/12/msg079021.html
http://mail-index.netbsd.org/source-changes/2016/11/12/msg079022.html
http://mail-index.netbsd.org/source-changes/2016/11/12/msg079027.html

2016-11-13

Today, again, some hours were wasted trying to fix the pipe(2)-based
semaphores after my refactoring; it failed and I removed my local changes from
the tree.

Today Christos addressed the getppid() issue in the tracee (to point to the
real parent).

I added new tests attach6 and attach7 with the following commit:

Add new tests attach6 and attach7 in t_wait_proc{4,6,id,pid}

attach6:
    Assert that tracer sees its parent when attached to tracer (check
    sysctl(7) and struct kinfo_proc2)

attach7:
    Assert that tracer sees its parent when attached to tracer (check
    /proc/curproc/status 3rd column).

Currently these tests fail as getppid() and parent id obtained in these
alternative ways differ.

Sponsored by <The NetBSD Foundation>
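
Extracting the parent's pid from a procfs status line boils down to picking one
whitespace-separated column. A minimal sketch, assuming the parent's pid really
is the n-th column as noted above; field_as_long() is a made-up helper and it
would be confused by a process name containing spaces.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse the n-th (1-based) whitespace-separated field of line as a long. */
static int
field_as_long(const char *line, int n, long *out)
{
        const char *p = line;
        char *end;
        int field;

        for (field = 1; ; field++) {
                while (*p != '\0' && isspace((unsigned char)*p))
                        p++;
                if (*p == '\0')
                        return -1;      /* fewer than n fields */
                if (field == n)
                        break;
                while (*p != '\0' && !isspace((unsigned char)*p))
                        p++;
        }
        errno = 0;
        *out = strtol(p, &end, 10);
        if (end == p || errno != 0)
                return -1;              /* not a number or out of range */
        if (*end != '\0' && !isspace((unsigned char)*end))
                return -1;              /* trailing junk after the digits */
        return 0;
}
```

A test would read the line from the procfs file, call
field_as_long(line, 3, &ppid) and compare the result with getppid(2).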


I have also fixed a pipe(2) IPC race in attach5 and propagated it to attach6
and attach7.


Finally there is new test for securelevel:

Add attach_pid1_securelevel in t_ptrace

Assert that a debugger cannot attach to PID 1 with securelevel >= 1 (as root).

Test requested by <mrg>

Sponsored by <The NetBSD Foundation>

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/13/msg079054.html
http://mail-index.netbsd.org/source-changes/2016/11/14/msg079055.html
http://mail-index.netbsd.org/source-changes/2016/11/14/msg079057.html

2016-11-14

No new issues were unveiled, so I was researching fork(2). I noted that the
ptrace_lwpinfo structure (acquirable with PT_LWPINFO) in FreeBSD is very
extensive. The NetBSD version has two structure members:
 - pl_lwpid - LWP identifier
 - pl_event - event that stopped the LWP

pl_event can take two values: PL_EVENT_NONE and PL_EVENT_SIGNAL.

PT_LWPINFO is dedicated to a thread. For a process, according to my
understanding, there is PT_GET_PROCESS_STATE.

FreeBSD puts everything about a thread into ptrace_lwpinfo. At the moment I
don't see a reason not to reuse the native <lwp.h> and <sys/lwp.h> and get
mostly the same result.

I'm not sure whether PT_SYSCALL and PT_SYSCALLEMU are a duplicated interface
for EVENT_MASK or not.

I might need to add a VFORK type in EVENT_MASK next to FORK.

I've added two basic tests eventmask1 and eventmask2 in t_ptrace_wait*, they
just assert that value stored with PT_SET_EVENT_MASK and restored with
PT_GET_EVENT_MASK is the same.

My plan for tomorrow is to finish a test where tracee forks and the event is
intercepted by a debugger, perhaps both for forker and forkee - so twice.

The good news is that EVENT_MASK might work, at least it can preserve its
value.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/15/msg079072.html
http://mail-index.netbsd.org/source-changes/2016/11/15/msg079073.html

2016-11-15

It was a successful day. I think the most successful and satisfying so far.

I committed two tests for fork(2) and additional sibling ones for vfork(2).
Their messages are as follows:

Add new test fork1 in t_ptrace_wait{4,6,id,pid}

Verify that fork(2) is intercepted by ptrace(2) with EVENT_MASK set to
PTRACE_FORK.

In this test tracee calls fork(2) and this event is noted by tracer, both
for forker and forkee with PT_GET_PROCESS_STATE reporting pe_report_event
equal to PTRACE_FORK and pe_other_pid as forkee for forker and forker for
forkee.

The fork(2) event in the current implementation stops forker and forkee
with the SIGTRAP signal.

Exited forkee stops forker with the SIGCHLD signal.

Sponsored by <The NetBSD Foundation>.


Add new test fork2 in t_ptrace_wait*

Verify that fork(2) is not intercepted by ptrace(2) with empty EVENT_MASK.

This test works with all wait(2)-like functions.

Debugger receives only SIGCHLD from tracee, when its the child of tracee
exits. Tracer notes nothing about fork(2) events.

Sponsored by <The NetBSD Foundation>


Add vfork1 test in t_ptrace_wait* and vfork2 in t_ptrace_wait{4,6,id,pid}

These tests are exact clones for fork1 and fork2, however testing vfork(2).

vfork1:
    Verify that vfork(2) is intercepted by ptrace(2) with EVENT_MASK set to
    PTRACE_VFORK.

vfork2:
    Verify that vfork(2) is not intercepted by ptrace(2) with empty
    EVENT_MASK.

vfork1 is supposed to test currently unimplemented PTRACE_VFORK option in
EVENT_MASK, marked as failure and linked with PR kern/51630.

Sponsored by <The NetBSD Foundation>


Today I also noted that PT_LWPINFO can already iterate over the list of threads
- pl_lwpid is used to look up the next thread! This is enlightening and it
eliminates the need for a new API. The FreeBSD version of PT_LWPINFO returns
the thread that interrupted the process; on NetBSD it just returns one thread
after another. I also looked at libpthread_dbg, used by gdb(1), as noted by
Christos Zoulas. This API was used to debug libpthread with the M:N model
(Scheduler Activation times) but it should still be fully functional.
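
The NetBSD iteration rule (pass 0 to get the first thread, pass the previous
ID to get the next one, receive 0 after the last one) can be modeled in plain
C. This is only a user-space model over an array, not kernel code and not a
ptrace(2) call; model_next_lwp() and model_count_lwps() are made-up names, and
the -1 result for an unknown ID is an assumption of the model.

```c
#include <stddef.h>

/*
 * Return the ID that follows `after` in ids[], the first ID if `after`
 * is 0, and 0 if `after` was the last entry. An ID that is not in the
 * list yields -1 (assumption of this model only).
 */
static int
model_next_lwp(const int *ids, size_t n, int after)
{
        size_t i;

        if (after == 0)
                return n > 0 ? ids[0] : 0;
        for (i = 0; i < n; i++)
                if (ids[i] == after)
                        return i + 1 < n ? ids[i + 1] : 0;
        return -1;
}

/* Walk the list the way a tracer would: start from 0, stop at 0. */
static size_t
model_count_lwps(const int *ids, size_t n)
{
        size_t count = 0;
        int cur = 0;

        while ((cur = model_next_lwp(ids, n, cur)) > 0)
                count++;
        return count;
}
```

In a real tracer the lookup step would be a PT_LWPINFO request with pl_lwpid
set to the previous ID, so counting threads needs no dedicated request.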

I also made big progress on porting/adapting the FreeBSD tests. It turned out
that the fork(2) ones aren't really useful or applicable for NetBSD. These
tests mostly verify things already checked by the attach* and traceme*
families, just with an additional call to fork(2) - for example checking that
the debugger is not a parent. Adding new tests that do the same things as the
ones already modeled is a waste of time. Of course it might still be useful to
verify whether some combination of fork(2) + unrelated debugger + PT_ATTACH +
three children corrupts memory or something... but that's a task for a syscall
fuzzer rather than for writing every possible application with a combination
of ATF functions... it's not really needed. So finally, 18 out of 23 basic
FreeBSD tests are adapted for NetBSD or skipped as not applicable.

I'm still evaluating whether it's useful to add more events to be intercepted,
given that everything can already be tracked with syscall enter and exit
notifications. FreeBSD has special calls to track only syscall exits or only
syscall enters... I think it's pointless as there is already DTrace for that.
There are special events for LWP creation and exit... are they needed? For now
I just requested an event for vfork(2).

My concept today is to add PT_SUSPEND to stop a process (if a positive value,
the pid, is passed) or to stop an lwp (if a negative value is passed), and to
add a new type of pl_event (in ptrace_lwpinfo), PL_EVENT_TRACER, to inform that
a thread was suspended by a tracer. I will give it more thought later; in
general we need to handle suspension, as the other features aren't so crucial.

The remaining work is to be done for:
 - thread/process suspension design and implementation
 - vfork(2) event implementation
 - debug registers implementation (at least for amd64 for now)
... anything else?

The project time frame is half over, so I will now spend some time on the
current pthread-based implementation of threads used in gdb(1). I need to
understand its purpose and how it works, and later add tests + documentation.

Today I was also wondering about adding examples illustrating how to use
ptrace(2) on NetBSD, bound to the ./build.sh infrastructure to keep the code
building and prevent it from rotting quickly. But that's only a plan for when
I finish the more urgent tasks. It's also clear that in order to do something
extra... the main work needs to be done first, and quickly.

To quote my original mail to tech-kern@:

<start quote>
For start, before switching to process plugin stage is to extend NetBSD
ptrace(2) with the following features:

 - PT_LWPINFO extend struct ptrace_lwpinfo with additional fields used
in LLDB in the current FreeBSD process code (pl_flags, pl_child_pid,
pl_siginfo),

 - PT_GETNUMLWPS - number of kernel threads associated with the traced
process,

 - PT_GETLWPLIST - get the current thread list,

 - PT_SUSPEND - suspend the specified thread,

 - PT_RESUME - resume the specified thread.
</end quote>

It seems that PT_GETNUMLWPS and PT_GETLWPLIST are no longer needed.
PT_RESUME is replaced by PT_CONTINUE (with a negative value to handle threads).
pl_flags in ptrace_lwpinfo won't be needed, as it's an ugly design of FreeBSD
to duplicate in this struct everything that does not belong to a single
thread. pl_child_pid is handled by (ptrace_state.)pe_report_event==PTRACE_FORK
with the PID value stored in pe_other_pid. pl_siginfo is odd, as we can just
intercept signal SIGINFO and check l_sigwaited from <sys/lwp.h> (if not, I will
add a test and compare the needed functionality to be sure). So in the end
SIGINFO must be researched, debug registers implemented, and PT_SUSPEND
designed and implemented.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/15/msg079082.html
http://mail-index.netbsd.org/source-changes/2016/11/15/msg079084.html
http://mail-index.netbsd.org/source-changes/2016/11/15/msg079086.html

2016-11-16

Today I started researching the <pthread_dbg.h> interface. I added three tests
dummy1, dummy2 and dummy3 in a new lib/libpthread_dbg/t_dummy file.

Excuse me for not describing the interfaces in detail and limiting my notes to
commit logs. This is because I was focusing on the code and finished the work
at... 5 AM.

Add new test-suite t_dummy for libpthread_dbg

At the moment this test does nothing except reports failure from td_open()
for overloaded (implemented) dummy1_proc_lookup() (.proc_lookup from
td_proc_callbacks_t) of the following form:

static int
dummy1_proc_lookup(void *arg, const char *sym, caddr_t *addr)
{
        return TD_ERR_ERR;
}

This file and directory with tests is placeholder for new ones, without
further need to alter mtree and distribution sets.

The libpthread_dbg interface and library is used by gdb(1) to handle
threads in applications.

Sponsored by <The NetBSD Foundation>


Add new test dummy2 in lib/libpthread_dbg/t_dummy

This tests implements:
 - .proc_read with memcpy(3)
 - .proc_write with memcpy(3)
 - .proc_lookup with dlopen(3) and dlsym(3) combination

td_open() is verified to return with success (TD_ERR_OK), followed by a
call to td_close().

This test does nothing else than initializing and deinitializing td_proc_t
structure, with appropriately implemented functions.

Sponsored by <The NetBSD Foundation>

Add dummy3 in lib/libpthread_dbg/t_dummy

This test verifies that it's not possible to attach twice to the same
process with td_open() -- it asserts failure with status TD_ERR_INUSE.

This test does nothing besides initializing and deinitializing pthread_dbg
debugging instance.

Refactor code to be more reusable.

Set proper description of dummy2.

Sponsored by <The NetBSD Foundation>.


My short comment is that I failed to get useful usage information from
gdb(1) - its code is very complicated. For example, I could not find a way to
translate a symbol name in ASCIIZ form to its address in process memory... I
had to research it myself. Joerg suggested dladdr(3), and finally dlopen(3)
combined with dlsym(3) gave the proper result. At least the file for NetBSD
threads in gdb(1) is simpler and can be used as inspiration for how to use
this API. I also verified that its code makes use of the ptrace(2) PT_LWPINFO
call in order to scan all threads.

The debugging code has routines for thread suspension and resumption... but
they aren't applicable to lwps. For now I need to research the other functions
of the pthread_dbg API and add appropriate tests.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/16/msg079117.html
http://mail-index.netbsd.org/source-changes/2016/11/17/msg079123.html
http://mail-index.netbsd.org/source-changes/2016/11/17/msg079125.html

2016-11-17

Today I got reports from the releng servers that dummy1 and dummy2 in
lib/libpthread_dbg/t_dummy fail on i386 hosts. On amd64 only the dummy3
failure is reproducible... sometimes. It's nondeterministic whether it works
or not. I filed a PR for them and moved on to research other functions.

I noted that there are currently almost no consumers of the pthread_dbg API;
gdb(1) uses only a small subset of it, not much beyond td_open() and
td_close().

I added locally tests for iteration over threads.. but I'm getting SIGSEGV that
stops the game quickly.

My plan is to cover all the functions there with tests and research how they
work and whether they are functional. The code is from around 2001 and has no
consumers, so it might be totally broken. The good news is that I might
totally skip PT_SUSPEND, as it's handled on the pthread layer. I don't think
that it's useful to write threads without POSIX threads on Unices (the golang
runtime might have reimplemented them). I need to research who added lwp
resume for ptrace(2) in NetBSD, and why.

There are 2 weeks left; I think it's best to focus on pthread_dbg in that
time. If this is all that I need to do, I will be able to move on to LLDB
right afterwards.

I was also researching the possibility of an interprocess dlsym(3)... there
seems to be nothing readily available. I will try dlopen("someprogram", ...).
I need this to implement pthread_dbg for a process traced with ptrace(2).

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/17/msg079137.html

2016-11-18

I finally sorted out the issues with pthread_dbg... in the end the culprit was
running the pthread_dbg tests under a debugger. It's explained in the
following commit:

Fix basic_proc_read in pthread_dbg functions


Source and destination were swapped. The source of this confusion was that
running these tests under gdb(1) will generate false positives as it will
initialize pthread__dbg to PID of the debugger. This means that it is
currently not possible to debug pthread_dbg code under a full-stack
debugger using NetBSD debugging library for threads.

This should address:
PR lib/51633 tests/lib/libpthread_dbg/t_dummy unreliable
PR lib/51635: td_thr_iter in <pthread_dbg.h> seems broken

After applying the fix I'm able to run all pthread_dbg tests without
indeterminism. The indeterminism was caused by overwritting source of data
with trash.


Sponsored by <The NetBSD Foundation>
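
The direction of the copy in a local read callback is easy to get wrong, since
memcpy(3) takes the destination first. A minimal sketch of both directions,
using simplified made-up signatures (local_proc_read() and local_proc_write()
are not the real td_proc_callbacks_t members):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical read callback for a debugger inspecting its own address
 * space: copy `size` bytes FROM the target address INTO the caller's buffer.
 */
static int
local_proc_read(const void *target_addr, void *buf, size_t size)
{
        memcpy(buf, target_addr, size);         /* memcpy(dst, src, n) */
        return 0;
}

/* The mirror image: copy FROM the caller's buffer INTO the target address. */
static int
local_proc_write(void *target_addr, const void *buf, size_t size)
{
        memcpy(target_addr, buf, size);
        return 0;
}
```

With the arguments of the read swapped, every "read" would instead overwrite
the inspected data with whatever was in the caller's buffer, which matches the
kind of indeterminism described in the commit message.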


So yes, I filed a PR for td_thr_iter() and finally sorted things out.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/18/msg079160.html
http://mail-index.netbsd.org/source-changes/2016/11/19/msg079164.html

2016-11-19

Today I wrote some internal tests for pthread_dbg and found that in several
calls this library tries to access pt_magic and compare it with PT_MAGIC.

In general the code for thread resume and suspend in pthread_dbg is a dummy.
I'm thinking about how to make it functional... is it possible to just call
_lwp_suspend(2) and _lwp_continue(2)?

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/20/msg079180.html
http://mail-index.netbsd.org/source-changes/2016/11/20/msg079184.html
http://mail-index.netbsd.org/source-changes/2016/11/20/msg079186.html

2016-11-20

I wrote a mail to tech-kern@ that summarizes my thoughts and plans:

In short we are already in a good position with the existing ptrace(2)
interfaces, as most necessary functions in LLDB are representable by
existing NetBSD specific interfaces.

I didn't want to implement (commit) new interfaces prior matching them
with real-life software like LLDB in a real and testable code-path.


1. fork(2) events are implementable with EVENT_MASK (PT_SET_EVENT_MASK,
PT_GET_EVENT_MASK, PT_GET_PROCESS_STATE). Event name is PTRACE_FORK.

2. vfork(2) events should be implemented with EVENT_MASK and event name
PTRACE_VFORK using the same ptrace(2) functions and structures like fork(2).

I'm holding on, as there is additional event used in the FreeBSD and
Linux world to report vfork(2) parent's continuation after child's
termination. Both events can be combined with the same PTRACE_VFORK
type, just set pe_other_pid to an invalid number when child exits to
distinguish them.

3. New interfaces for PT_LWPINFO are not needed, as this call already
can iterate over a list of all lwps. This ptrace(2) call has different
meaning to the FreeBSD one, as in FreeBSD it returns the lwp that
stopped the process (and the one that switched to debugger), in NetBSD
it retrieves the next lwp with the following rule:

pl_lwpid contains a thread LWP ID.  Information is
returned for the thread following the one with the
specified ID in the process thread list, or for the first
thread if pl_lwpid is 0.  Upon return pl_lwpid contains the
LWP ID of the thread that was found, or 0 if there is no
thread after the one whose LWP ID was supplied in the call.

This interface is planned to be used when pthread_dbg is unavailable,
as pthread_dbg has dedicated functions for the same purpose
(namely td_thr_iter()).

4. There is no need to extend struct ptrace_lwpinfo to size of
FreeBSD... as there is pthread_dbg that takes this job to inspect thread
(like thread name, sigmask, ...).

5. I want to use pthread_dbg for the following operations:
 - inspect threads (retrieve its name, sigmask, whether there are
waiters etc)
 - set/get registers per lwp
 - suspend/resume thread

6. Implementation of the thread resume/suspend operation in pthread_dbg
is trivial (I keep it locally) and it's based on new callback functions
SUSPEND() and RESUME():
 - for local process just call there _lwp_continue(2) and _lwp_suspend(2),
 - for remote ptrace(2) call with PT_CONTINUE and [temporarily missing]
PT_SUSPEND.

7. For non-pthread world keep fallback to native ptrace(2) interfaces
with the same features like in pthread_dbg minus inspection of threads.
Keep suspension, resume capability, get/set regs.. skip investigation of
sigmask, name of a process etc.

Usage of pthread_dbg largely simplifies interface between kernel and
user-space and deduplicates information, no need to pass it via the
ptrace_lwpinfo structure... just access it in pthread_t via pthread_dbg.

The main benefit of pthread_dbg is that we are in direct control of
pthread_t state and can read waiters for a thread, thread specific data etc.

8. If there will be need to retrieve more information on lwp, especially
in a single-threaded program (or one using lwp interface directly) I
would extend ptrace(2) with retrieving struct lwp in PT_LWPINFO in a new
field, next to pl_lwpid and pl_event... but I want to skip it for later
to match real-life needs in real debuggers first.

9. To achieve thread suspension and continuation in 5. and 6., I need to
add PT_SUSPEND - as a counterpart to PT_CONTINUE. I want to add there
two modes analogously to PT_CONTINUE:
 - for positive values, suspend the whole process with pid "value"
 - for negative values, suspend lwp with identity "-value"

10. I'm evaluating addition of new types of pl_event (in ptrace_lwpinfo)
- it describes what stopped a thread. Next to PL_EVENT_NONE and
PL_EVENT_SIGNAL, I would add:
 - PL_EVENT_TRACER thread suspended by a debugger calling PT_SUSPEND
 - PL_EVENT_LWP thread suspended by _lwp_suspend(2) from user-space

Like previously, I'm not sure whether these new PL_EVENT types will be
used and useful at all in real applications, so I will hold off on them.

11. Add support for CPU debug registers. Unlike the above parts, this
one could be clearly ported as is from FreeBSD right now. It's also very
useful and needed to set watchpoints in memory.

These are currently used by all supported LLDB targets but NetBSD, so:
Linux, Android, MacOSX, Windows, FreeBSD.

12. There are code paths for the SIGINFO event in LLDB, I haven't been
evaluating it so far, as it looks like a low priority for now. In worst
scenario there will be need to add new EVENT_TYPE for SIGINFO and pass
siginfo struct there, but it's not certain and I delayed it for later
(if ever). For the first look we can just capture regular signal,
compare it with regular functions and handle ksiginfo struct with..
perhaps wait6(2)-like function. It's not researched and for now skipped.

My plan for the coming days:

A. Add introductory man-pages for pthread_dbg, currently just for the
used functions in the existing ATF tests, as other interfaces might be
altered later... or just dropped as unneeded. This library still carries
debt from Scheduler Activation times, and that shall be just revamped.

B. Implement debug registers, base this code on FreeBSD. Add ATF tests,
commit to master repository.

C. Implement locally PT_SUSPEND and keep it on a local branch.

D. Implement locally PTRACE_VFORK (right now just for calling vfork(2)
and for creating a child) and keep it on a local branch.

... switch to LLDB

I also added new tests for pthread_dbg in the t_threads file and fixed the
function that retrieves a thread's name.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/20/msg079199.html
http://mail-index.netbsd.org/source-changes/2016/11/20/msg079203.html
http://mail-index.netbsd.org/source-changes/2016/11/20/msg079204.html
http://mail-index.netbsd.org/source-changes/2016/11/20/msg079207.html
http://mail-index.netbsd.org/source-changes/2016/11/21/msg079213.html

2016-11-21

I documented basic pthread_dbg(3) interfaces:
 - pthread_dbg(3)
 - td_close(3)
 - td_map_pth2thr(3)
 - td_open(3)
 - td_thr_getname(3)
 - td_thr_info(3)
 - td_thr_iter(3)

I added 4 tests for td_map_pth2thr(3).

I marked thread_type in td_thread_info_st as unneeded. This used to distinguish
kernel and user threads... however there are no posix threads in the kernel!

There are many more functions and members in td_thread_info_st, however I need
to hold on with working with them and switch to CPU debug registers tomorrow.
I plan to spend 3 days for them.

Today I got an idea how to implement an interprocess dlsym(3): the point is to
scan the process map (like /proc/curproc/map) and later use elf(3) to locate
symbols. There are dedicated sysctl(7) calls for scanning the map of
mappings.. however in the past I couldn't find any users. This is tempting me
to spend some time on it, as I could make use of it in the pthread(2) tests...
and perhaps it's not far from a standalone debugger. There is still a need for
addr2line-like features, stepping the code and so on - it wouldn't be as
advanced as gdb(1) or lldb(1), but good enough for every architecture and
plain C applications. I don't know how C++ symbol demangling would affect the
application and it's not something I would consider adding. On the other hand
there is already radare2, which is far from gdb(1), but it could be used as
inspiration. A major shortcoming of radare2 (at least for me) is its
unintuitive interface: a zoo of commands, shortcuts and capabilities that is
hard to keep in memory... and there is little point in even trying, as it's
evolving rapidly.
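
A minimal sketch of the address arithmetic behind such an interprocess
dlsym(3), assuming the tracer has already parsed the tracee's map into a table
and read the symbol's st_value with elf(3). The struct map_entry layout and the
helper names are hypothetical and not an existing NetBSD interface; the sketch
also assumes a position-independent object whose first segment starts at
virtual address 0.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, already-parsed entry of the tracee's memory map. */
struct map_entry {
	uintptr_t   start;	/* first address of the mapping */
	uintptr_t   end;	/* one past the last address */
	const char *path;	/* backing object, e.g. a shared library */
};

/* Return the lowest mapping backed by the given object, or NULL. */
const struct map_entry *
find_object(const struct map_entry *map, size_t n, const char *path)
{
	const struct map_entry *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (strcmp(map[i].path, path) != 0)
			continue;
		if (best == NULL || map[i].start < best->start)
			best = &map[i];
	}
	return best;
}

/*
 * Translate a symbol value taken from the object's symbol table into an
 * address valid in the tracee.  Returns 0 if the object is not mapped.
 */
uintptr_t
remote_symbol_addr(const struct map_entry *map, size_t n, const char *path,
    uintptr_t st_value)
{
	const struct map_entry *e = find_object(map, n, path);

	return e == NULL ? 0 : e->start + st_value;
}
```

The resulting address could then be handed to PT_READ_D or PT_IO to inspect the
symbol in the traced process.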

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/21/msg079240.html
http://mail-index.netbsd.org/source-changes/2016/11/21/msg079242.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079244.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079245.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079246.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079249.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079253.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079254.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079255.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079256.html

2016-11-22

I've added 102 ptrace(2) tests for PT_READ_D, PT_WRITE_D and PT_IO
(PIOD_READ_D, PIOD_WRITE_D). The tests were added for all wait(2)-like
functions and this quickly multiplied the overall number of new tests to 102.

I keep duplicating code in the tests quite consciously as I want to keep them
copy-and-pasteable, and quite often I need to run them outside of the ATF
context, including on other non-NetBSD hosts.
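
A sketch of how one test body can be reused for several wait(2)-like
functions, which is why the test count multiplies. The TWAIT() macro, the
TWAIT_WAIT4 switch and child_exit_status() are illustrative names and not
necessarily what the committed tests use; only waitpid(2) and wait4(2) are
covered here.

```c
#include <sys/types.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical selector: building the same file with -DTWAIT_WAIT4 yields
 * a second test program; waitpid(2) is the default.
 */
#if defined(TWAIT_WAIT4)
#define TWAIT(pid, st)	wait4((pid), (st), 0, NULL)
#else
#define TWAIT(pid, st)	waitpid((pid), (st), 0)
#endif

/* One test body shared by every wait(2)-like variant. */
int
child_exit_status(int code)
{
	int status;
	pid_t child = fork();

	if (child == -1)
		return -1;
	if (child == 0)
		_exit(code);		/* child: terminate immediately */
	if (TWAIT(child, &status) != child)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the body has no ATF dependency, it can be pasted into a standalone
program on another host.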

Why READ and WRITE and not DEBUG registers? I needed to verify that I can
easily get the addresses of variables and manipulate them via ptrace(2).
Actually, with fork(2), virtual addresses in the tracee and in the tracer seem
to be the same.

If this does not hold, these tests will break and the debug registers (which
will use the same mechanism) won't be the culprit.

Today I also sent (prompted by Thomas Klausner) an e-mail with a patch for
mdocml: a one-liner diff to make it recognize the pthread_dbg(3) library.

Plan for tomorrow: start up with porting amd64 specific debug registers code
from FreeBSD.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079267.html
http://mail-index.netbsd.org/source-changes/2016/11/22/msg079269.html
http://mail-index.netbsd.org/source-changes/2016/11/23/msg079274.html
http://mail-index.netbsd.org/source-changes/2016/11/23/msg079275.html
http://mail-index.netbsd.org/source-changes/2016/11/23/msg079277.html

2016-11-23

I planned to add today tests for all read&write-instruction type ptrace(2)
calls like PT_READ_I, PT_WRITE_I and PIOD_READ_I & PIOD_WRITE_I from PT_IO.

I also added 3 handshake functions for Data types.

In the end this will summarize what happened today:
Add {,io_}read_i[1234] in t_ptrace_wait{,3,4,6,id,pid}

New tests are direct counterparts to the existing ones {,io_}read_d[1234].

PT_READ_D and PIOD_READ_D (from PT_IO) are traditionally used to transfer
data segment. New tests make use of PT_READ_I and PIOD_READ_I (from PT_IO)
in order to copy text segment from traced process.

    Traditionally, ptrace() has
    allowed for machines with distinct address spaces for
    instruction and data, which is why there are two requests:
    conceptually, PT_READ_I reads from the instruction space
    and PT_READ_D reads from the data space.  In the current
    NetBSD implementation, these two requests are completely
    identical.

    --- ptrace(2)

New tests follow the traditional convention and copy only instructions from
dummy functions.

There are no new tests copying data to .text regions from the traced
process, as this is violating mprotect restrictions. This limitation may be
addressed in future, as there are dedicated sysctl(7) handlers for
debuggers (currently mainly gdb(1)).

Tests verifying identical behavior of PT_READ_I and PT_READ_D are not
planned.

Sponsored by <The NetBSD Foundation>

Add regs[12345] in t_ptrace_wait{,3,4,6,id,pid}

Add new ATF tests for the general purpose register calls.

These tests require platforms to export all of the following macros:
 - PT_GETREGS
 - PT_SETREGS
 - PTRACE_REG_PC
 - PTRACE_REG_SET_PC
 - PTRACE_REG_SP
 - PTRACE_REG_INTRV

This has been done for the sake of C preprocessor magic simplicity.
There are ports without covering all of the above symbols -- skip them.
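
The guard described above could look roughly like this. HAVE_GPREGS and
have_gpregs() are illustrative names and not confirmed to match the committed
test code; on a platform lacking any of the macros the function simply
reports 0.

```c
#include <sys/types.h>
#include <sys/ptrace.h>

/* Combined guard: all six macros must be exported by the platform. */
#if defined(PT_GETREGS) && defined(PT_SETREGS) && \
    defined(PTRACE_REG_PC) && defined(PTRACE_REG_SET_PC) && \
    defined(PTRACE_REG_SP) && defined(PTRACE_REG_INTRV)
#define HAVE_GPREGS 1
#endif

/* Report whether the register tests would be compiled in on this platform. */
int
have_gpregs(void)
{
#ifdef HAVE_GPREGS
	return 1;
#else
	return 0;
#endif
}
```

A single all-or-nothing macro keeps each test under one #ifdef instead of a
separate check per symbol.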

Added tests
===========

regs1:
    Verify plain PT_GETREGS call without further steps

regs2:
    Verify plain PT_GETREGS call and retrieve PC

regs3:
    Verify plain PT_GETREGS call and retrieve SP

regs4:
    Verify plain PT_GETREGS call and retrieve INTRV

regs5:
    Verify PT_GETREGS and PT_SETREGS calls without changing regs

Sponsored by <The NetBSD Foundation>

Remove duplicated PT_DUMPCORE description in machine-specific calls section

This function is part of the general ptrace(2) interface.

Sponsored by <The NetBSD Foundation>

Today I noted in the OpenBSD 6.0 (released in 2016) release notes:
    Fix ptrace PT_WRITE_D that returned EFAULT (broken in
    src/sys/kern/sys_process.c r1.33).

   -- https://www.openbsd.org/plus60.html

CVS notes that 1.33 was... Sun Dec 11 21:30:31 2005 UTC

This convinced me again that the added tests are badly needed.

I've added locally a new test, regs6, inspired by a regress test from OpenBSD,
which sets an unaligned PC register. It seems to work without error on NetBSD;
I need to run it on Linux and FreeBSD to get a comparison, as I would expect
EINVAL or something...

Plan for tomorrow is to finish this regs6 and finally switch to debug
registers. The last two days were rather exhausting and they seemed to be a bit
off the main goal.. but I still consider them needed.

Today I added 78 or so tests.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/23/msg079285.html
http://mail-index.netbsd.org/source-changes/2016/11/23/msg079286.html
http://mail-index.netbsd.org/source-changes/2016/11/23/msg079287.html
http://mail-index.netbsd.org/source-changes/2016/11/23/msg079289.html
http://mail-index.netbsd.org/source-changes/2016/11/24/msg079291.html
http://mail-index.netbsd.org/source-changes/2016/11/24/msg079294.html
http://mail-index.netbsd.org/source-changes/2016/11/24/msg079295.html

+ unrelated to funded project:
http://mail-index.netbsd.org/source-changes/2016/11/24/msg079296.html

as I was looking whether there are any ptrace(2) tests out there

2016-11-24

I abandoned the test for unaligned PC for now and moved on. I added two basic
tests for fpregs.. and moved to debug registers.

I'm mainly focused on the Intel x86 architecture.. but it quickly became
apparent that concurrent work on the 32 and 64-bit variants is needed in order
to keep the code sane. There is extra work in mcontext, ucontext magic.. and
procfs. There is also code for compat32 around...

There is one mystery with the x86 architecture: I noted in some resources that
there are 16 debug registers on amd64, however I cannot find this information
in the Intel specs.. I will trust that registers 8-15 are reserved.. and just
omitted for simplicity.

I was thinking whether to trade a more extensive debug registers implementation
for PT_CONTINUE and EVENT_VFORK... however I decided to go the harder way now
and finish everything at a high cost in time.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/24/msg079321.html

2016-11-25

Today I discovered that debug registers affect cpu_switch.. and related
calls. It's getting worse than expected.

I got confirmation that the 80386 already had debug registers in the same shape
as the ones present in newer i386 cpus today.

I started code for 32-bit debug regs.. and I will resume it tomorrow.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/25/msg079349.html

2016-11-26

I wrote a mail to tech-kern@ that I will focus till the end of this month only
on the debug registers and skip PT_SUSPEND and EVENT_VFORK.

There are more unexpected issues out there. I discovered that debug
registers are already in use by KSTACK_CHECK_DR0 on i386.. it's an option to
protect the kernel stack:

     options KSTACK_CHECK_MAGIC
     Check kernel stack usage and panic if stack overflow is detected.  This
     check is performance sensitive because it scans stack on each context
     switch.

This code was added by yamt:
https://github.com/IIJ-NetBSD/netbsd-src/commit/d9961ce2e70a1bee4c9f15d45d19a244cf16c3cc

It adds / uses existing functions on i386.. dr0(), rdr6(), ldr6().

There were two functions rdr6() and ldr6() for amd64, but they are perhaps
unused and rdr6() looks broken, as it does not return the value in the proper
register - I will make sure tomorrow.

I've published early draft - incomplete:
http://www.netbsd.org/~kamil/dbg4.txt

Plan for tomorrow is to understand cpu_switch() and add support for debug
registers there. The next step is to handle the trap when a debug register
fires. And later on, fix ucontext_t size for 32-bit and rump.. Finally add
the missing 32-bit code and start testing..

There are 4 days left, not many, but it's still doable to get it functional by
then.

I also noted that XEN is handled like a separate architecture and needs special
calls to get and set debug regs.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/27/msg079374.html

2016-11-27

Today, I have cleaned up accessors for DRx and committed code for i386, amd64
and XEN.

Add accessors for available x86 Debug Registers

There are 8 Debug Registers on i386 (available at least since 80386) and 16
on AMD64. Currently DR4 and DR5 are reserved on both cpu-families and
DR8-DR15 are still reserved on AMD64. Therefore add accessors for DR0-DR3,
DR6-DR7 for all ports.

Debug Registers x86:
 * DR0-DR3  Debug Address Registers
 * DR4-DR5  Reserved
 * DR6      Debug Status Register
 * DR7      Debug Control Register
 * DR8-DR15 Reserved

Access to the registers is available only from the kernel (ring 0) as it
requires the top protection level. For this reason special XEN functions are
needed to get and set the registers in the XEN3 kernels.

XEN specific functions as defined in NetBSD:
 - HYPERVISOR_get_debugreg()
 - HYPERVISOR_set_debugreg()

This code extends the existing rdr6() and ldr6() accessors with additional ones:
 - rdr0() & ldr0()
 - rdr1() & ldr1()
 - rdr2() & ldr2()
 - rdr3() & ldr3()
 - rdr7() & ldr7()

Traditionally the accessors for DR6 were passing a vaddr_t argument. While it's
the appropriate type for DR0-DR3, DR6-DR7 should be using u_long; however it's
not a big deal. The resulting functionality should be equivalent, so stick
to this convention and use the vaddr_t type for all DR accessors.

There was already a function defined for rdr6() in XEN, but it had a nit on
AMD64 as it was casting HYPERVISOR_get_debugreg() to u_int (32-bit on
AMD64), truncating result. It still works for DR6, but for the sake of
simplicity always return full 64-bit value.

New accessors duplicate functionality of the dr0() function available on
i386 within the KSTACK_CHECK_DR0 option. dr0() is a specialized layer with
logic to set appropriate types of interrupts, while the new accessors are
designed to pass verbatim values from user-land (with simple sanity checks in
the kernel). At the moment there are no plans to make it possible for
KSTACK_CHECK_DR0 to coexist with debug registers for user applications
(debuggers).

     options KSTACK_CHECK_DR0
     Detect kernel stack overflow using DR0 register.  This option uses DR0
     register exclusively so you can't use DR0 register for other purpose
     (e.g., hardware breakpoint) if you turn this on.

The KSTACK_CHECK_DR0 functionality was designed for i386 and never ported
to amd64.

Code tested on i386 and amd64 with kernels: GENERIC, XEN3_DOMU, XEN3_DOM0.

Sponsored by <The NetBSD Foundation>


Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/27/msg079375.html

2016-11-28

I finally understood more of the parts behind cpu_switchto on amd64. I'm amazed
by the clean design and clarity of the implementation.. it was certainly deeply
researched. However I still don't understand it well enough to add new and
correct code. Today I have an idea how to change cpu_switch to add
support for CPU Debug Registers on x86. I will try to implement it tomorrow and
add bits to handle interrupts and transfer them to userland.

There are still odd issues with building distribution and rump.. I will fix it
later. For now AMD64 must be functional first.

I also added a basic test for PT_STEP. At the moment most of the functions in
ptrace(2) are covered with basic ptrace(2) checks. Still missing are the
following ones: PT_DUMPCORE, PT_SYSCALL, PT_SYSCALLEMU, PT_LWPINFO. i386 also
has PT_GETXMMREGS and PT_SETXMMREGS.

The procfs interface is still untouched and for now not planned to be touched..
however it would be good to align it with the capabilities of the native
ptrace(2) syscalls.

There are two planned days left to finish this task. I'm not sure it will be
ready, but at least it's still doable without greatly extending deadlines.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/28/msg079396.html

2016-11-29

I added more tests for PT_STEP and initial ones for CPU Debug Registers
(dbreg1, dbreg2).

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/29/msg079402.html
http://mail-index.netbsd.org/source-changes/2016/11/29/msg079404.html

2016-11-30

It was a long day. I mailed Christos Zoulas with the current status and a
request for help transforming my draft of CPU Debug Registers support into the
final version.

I think I finished all the needed bits for amd64 (modulo bugs).

I published the code as: http://www.netbsd.org/~kamil/dbg7.txt

My kernel booted for the first time, it passed existing tests for t_ptrace and
additional ones for t_ptrace_wait, including the simple ones for dbregs
accessors! It was surprising.

I started to prepare new tests dedicated for amd64, using dbregs specific parts
out there on amd64.. I refactored t_ptrace_wait.c to:
 - t_ptrace_wait.h       - common code
 - t_ptrace_wait.c       - old MI tests
 - t_ptrace_amd64_wait.c - placeholder for new amd64-only tests

However I ran into an issue: sometimes (in a reproducible way) gcc(1) crashes.

I mailed Christos asking for a review and I rescheduled this for tomorrow. It
was a long day.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/11/30/msg079414.html

2016-12-01

I fixed several simple issues in dbg7.txt and created dbg8.txt.

I fixed also evbarm64-aarch64 build in t_ptrace_wait (kill1 and kill2).

I created initial tests/kernel/arch/amd64 debug register tests.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/01/msg079436.html
http://mail-index.netbsd.org/source-changes/2016/12/01/msg079437.html
http://mail-index.netbsd.org/source-changes/2016/12/02/msg079439.html
http://mail-index.netbsd.org/source-changes/2016/12/02/msg079446.html
http://mail-index.netbsd.org/source-changes/2016/12/02/msg079448.html

2016-12-02

I added more tests to check that Debug Registers are really preserved and
I refactored test names out there to be more verbose.. it would look bad to
have there 100 tests all called dbregsXYZ.

I also discovered that I was (wrongly?) using accessors for registers, as I
need to specify explicit LWPINFO.. so I added two test cases for PT_LWPINFO,
and of course there is something bad there, as I think that PL_EVENT_SIGNAL is
not set accordingly on signal.

More in commit messages:

Define new tests for CPU Debug Registers in t_ptrace_wait{,3,4,6,id,pid}

Rename dbregs1 to dbregs_print
Rename dbregs[2345] to dbregs_preserve_dr[0123]

Add new tests dbregs_preserve_dr[0123]_yield.

dbregs_preserve_dr0_yield:
     Verify that setting DR0 is preserved across ptrace(2) calls with
     scheduler yield

dbregs_preserve_dr1_yield:
     Verify that setting DR1 is preserved across ptrace(2) calls with
     scheduler yield

dbregs_preserve_dr2_yield:
     Verify that setting DR2 is preserved across ptrace(2) calls with
     scheduler yield

dbregs_preserve_dr3_yield:
     Verify that setting DR3 is preserved across ptrace(2) calls with
     scheduler yield

Add new tests dbregs_preserve_dr[0123]_continued.

dbregs_preserve_dr0_continued:
    Verify that setting DR0 is preserved across ptrace(2) calls and with
    continued child

dbregs_preserve_dr1_continued:
    Verify that setting DR1 is preserved across ptrace(2) calls and with
    continued child

dbregs_preserve_dr2_continued:
    Verify that setting DR2 is preserved across ptrace(2) calls and with
    continued child

dbregs_preserve_dr3_continued:
    Verify that setting DR3 is preserved across ptrace(2) calls and with
    continued child

Use more meaningful names for these tests as they are MD specific and
testing precise functionality. Also there will be a growing number of
tests in this category and prefixing everything with plain dbregs and
trailing with a number cannot be verbose.

Sponsored by <The NetBSD Foundation>

Add new tests lwpinfo1 in t_ptrace_wait* and lwpinfo2 under HAVE_PID guard

lwpinfo1:
    Verify basic LWPINFO call for single thread (PT_TRACE_ME)

lwpinfo2:
    Verify basic LWPINFO call for single thread (PT_ATTACH from tracer)

Both tests are marked as expected failure PR kern/51685:
    ptrace(2): Signal does not set PL_EVENT_SIGNAL in
    (struct ptrace_lwpinfo.)pl_event

Sponsored by <The NetBSD Foundation>

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/03/msg079469.html
http://mail-index.netbsd.org/source-changes/2016/12/03/msg079474.html

2016-12-03

Today I learnt that ptrace(2) calls like GETDBREGS can accept LWP=0, and it has
special meaning that it will process the first thread.

This code
http://netbsd.org/~kamil/dbg9.txt

Hangs here
https://nxr.netbsd.org/xref/src/sys/kern/kern_sig.c#1579

Stacktrace:
trap
userret
mi_userret
lwp_userret
issignal
sigswitch -- kern_sig.c#1579

In kernel/arch/amd64/t_ptrace_wait tests like dbregs_dr0_trap_variable.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/04/msg079498.html

2016-12-04

I was testing the code and it turns out that there is an additional issue with
pool_put() in cpu_lwp_free(); it's generating a panic like:

$ sudo crash -M ./netbsd.1.core  -N /netbsd.gdb
Crash version 7.99.42, image version 7.99.43.
WARNING: versions differ, you may not be able to examine this image.
System panicked: kernel diagnostic assertion "l->l_stat == LSONPROC"
failed: file "/usr/src/sys/kern/kern_sleepq.c", line 205
Backtrace from time of crash is available.
crash> bt
_KERNEL_OPT_NVGA_RASTERCONSOLE() at 0
?() at fffffe87af2cb7e8
log() at log
kern_assert() at kern_assert+0x56
sleepq_enqueue() at sleepq_enqueue+0xac
turnstile_block() at turnstile_block+0x476
mutex_enter() at mutex_enter+0x4f9
pool_put() at pool_put+0x2a
cpu_lwp_free() at cpu_lwp_free+0xb6
exit1() at exit1+0x1054
exit1() at exit1
sy_call() at sy_call+0x40
sy_invoke() at sy_invoke+0xd5
syscall() at syscall+0xff
--- syscall (number 1) ---
7ef55630f06a:
crash>

I mailed Christos with my thoughts to put dr[012367] directly into the pcb
structure. I think I will give it a try. The only showstopper is that in future
hardware might offer functionalities via all registers from dr0 to dr15 and it
will need to be refactored at that time.

Today Christos fixed issue with SIGNAL in lwpinfo1/lwpinfo2.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/04/msg079511.html

2016-12-05

Today Christos really fixed lwpinfo2 and refactored the IPC mechanism in the
t_ptrace_wait family of tests.

I was researching trap(3) on amd64 for the CPU debug register tracepoint fire
event. So far everything up to userret(9) looks fine: a user-space,
tracepoint-type trap is detected. It emits a signal and later enters the
userret(9) function.

I don't understand the overall process well enough yet to spot the mistake,
this is why I'm still trying to debug it... hopefully soon I will be
enlightened about what's wrong.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/05/msg079513.html

I also added small local cleanup in riscv code:
http://mail-index.netbsd.org/source-changes/2016/12/05/msg079514.html

2016-12-06

Today I learnt that there exists exect(3).. an exec(2)-like function that
enters tracee mode.

I also found that FIX_SSTEP() is a dummy: nothing is using it and apparently
nothing ever used it. The only user of it is in... OpenBSD -- alpha
port.

I also learnt (I was told by Rin Okuyama) that gdb(1) emulates PT_STEP for
alpha on NetBSD and OpenBSD, however we still desire to get a proper native
implementation.

I've found the key to locating the mistake in my CPU Debug Registers code...
it's likely in... cpu_switchto(9). I've narrowed the issue to:
issignal() -> sigswitch() -> mi_switch() -> cpu_switchto() [hang]

I will resume debugging it tomorrow by verifying cpu_switchto().

I'm comparing the behavior of the PT_STEP trap with the CPU debug register
trap and this led me to that point.
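
On x86 both kinds of trap arrive as the same debug exception, and the DR6
status register tells them apart: bit 14 (BS) is set for a single step and
bits 0-3 (B0-B3) for a hit on DR0-DR3. A minimal sketch of that distinction
follows; the names are illustrative and not taken from the NetBSD sources.

```c
#include <stdint.h>

#define DR6_B_MASK	UINT64_C(0x000f)	/* B0-B3: DR0-DR3 condition hit */
#define DR6_BS		UINT64_C(0x4000)	/* BS: single-step trap (bit 14) */

enum dbg_trap { DBG_TRAP_NONE, DBG_TRAP_STEP, DBG_TRAP_WATCH };

/*
 * Classify a debug exception from the DR6 value.  On DBG_TRAP_WATCH the
 * index of the lowest debug address register that fired is stored in *idx.
 */
enum dbg_trap
classify_debug_trap(uint64_t dr6, int *idx)
{
	if (dr6 & DR6_BS)
		return DBG_TRAP_STEP;
	if ((dr6 & DR6_B_MASK) == 0)
		return DBG_TRAP_NONE;
	for (int i = 0; i < 4; i++) {
		if (dr6 & (UINT64_C(1) << i)) {
			*idx = i;
			break;
		}
	}
	return DBG_TRAP_WATCH;
}
```

Checking BS first is a choice made for this sketch only; a real handler also
has to clear DR6 itself, as the processor does not do it.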

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/06/msg079542.html

2016-12-07

It was a difficult day.. I have found that my concept of mcontext, PT_GETDBREGS
and PT_SETDBREGS is broken by design. The concept wasn't originally mine, as it
was adapted from FreeBSD. In general watchpoints should get dedicated accessors
in ptrace(2) like PT_GETWATCHPOINTS/PT_SETWATCHPOINTS, as ports can use their
underlying MD code for implementing PT_STEP. In that case we cannot allow the
user to mess with it manually.

Today it was suggested that I check the sh3 implementation of UBC; it has code
that sets debug registers on entering user-land and removes them on entering
the kernel. This should be the appropriate design.

I'm learning about userret(9)... and things, but hey - why am I so far past
the deadline for ptrace(2).

I also noted that:
RETURN VALUES
     The cpu_switchto() function does not return until another LWP calls
     cpu_switchto().  It returns the oldlwp argument of the cpu_switchto()
     which is called to switch back to our LWP.  It is either a LWP which
     called cpu_switchto() to switch to us or NULL in case the LWP was
     exiting.

I'm learning about Interrupt Gate, Trap Gate, Call Gates.. and amd64 specific
code for it.

I also need to differentiate, regardless of the underlying hardware, that
breakpoints are only for code & reading and watchpoints for data &
reading|writing.

I was thinking to implement per-process and per-lwp watchpoints, maybe with
something like:

struct ptrace_watchpoint {
       lwpid_t pw_lwpid; /* 0 - all lwps within a process */
       void *pw_addr; /* 0 - disabled */
       int pw_mode; /* 01 - read; 10 - write; 11 - read/write */
};

Maybe these ptrace(2) calls are fine:
 - PT_GETWATCHPOINT
 - PT_SETWATCHPOINT
 - PT_NUMWATCHPOINTS

These calls would work dynamically regardless of changes in the code, like
extending ptrace_watchpoint with new members or adding support for more
watchpoints in general.

Reading for tomorrow:
http://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints
http://stackoverflow.com/questions/3425085/the-difference-between-call-gate-interrupt-gate-trap-gate
http://wiki.osdev.org/Interrupt_Descriptor_Table

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/07/msg079563.html

2016-12-08

It was a sad day, all day long src/ was broken (new audio code and npf
enhancements). In the end I have not added new code for watchpoints; I was
learning about breakpoints and I have added a simple test for exect(3). It
uncovered that this interface has been dysfunctional since day0.

I filed a PR for it: port-amd64/51700:
      exect(NULL,NULL,NULL) generates 15859 times SIGTRAP on amd64

There was also some discussion (mails from me):

===============================================================================

 Since this interface is apparently unclear I will add few sentences.
 
 The exect(3) call is an extension to execve(2); its added
 functionality is to raise SIGTRAP with si_code TRAP_TRACE.
 
 Currently mainly vax, i386 and amd64 implement this interface through
 libc, however this is more of a hack, as these ports set a machine flag
 used to step the code and resume with execve(2).
 
 If we want to keep its implementation in libc I think the best solution
 is to switch this code to the sequence of raise(3) && execve(2) for all
 ports (how to set si_code to TRAP_TRACE in this case?).
 
 However I think there is little point in tracing libc's internals and it
 might be more useful to raise the signal just before executing the first
 instruction, and this is doable.
 
 The exect(3) function was added on day0 (inherited from 386bsd) and
 perhaps unused since then. The man-page describes it as follows (text
 unmodified since that time):
 
   The function exect() executes a file with the program tracing
   facilities enabled (see ptrace(2)).
 
 This sounds to me like exect(3) should be equivalent to PT_TRACE_ME &
 execve(2) & raise(SIGTRAP with si_code=TRAP_TRACE) on first instruction
 of the new image.
 
 This call is currently unimplemented port-wide and perhaps not really
 functional with ports that implement it either, if we want to fix it I
 think we are free to alter its meaning and make it more useful now.
 
 I checked 386bsd sources and it's the same as the current i386 version
 -- without particular users. I think the proposed scenario with
 PT_TRACE_ME is the only variation that might be useful.
 
 exect(3) is kept in FreeBSD with the original questionable behavior and
 absent in OpenBSD
 

===============================================================================

 Good point, if exect(3) is going to be a programmer-friendly utility
 (and since it already reserves its entry in libc), I'm for a breakpoint
 on the main() function.
 
 In that scenario we would get debugger code like:
 
 int status;
 child = fork();
 if (child == 0)
     exect("/bin/cat", "cat", NULL);
 wait(&status);
 ptrace(...)
 
 On 09.12.2016 11:03, Paul Goyette wrote:
 > Just curious....  (private response, feel free to forward if you find it
 > useful)
 >
 >> This sounds to me like exect(3) should be equivalent to PT_TRACE_ME &
 >> execve(2) & raise(SIGTRAP with si_code=TRAP_TRACE) on first instruction
 >
 > The term "first instruction" could be interpreted as the beginning of
 > the executable's main() procedure.  Or it could be interpreted as the
 > beginning of crt0 code for the executable.
 >
 > I'm not sure which one you mean, but whatever decision is made, it needs
 > to be clearly and unambiguously documented.


===============================================================================

I also have an impression that something hard reboots my computer with recent
kernels, I hope the issues will be sorted out.

More reading for tomorrow:
http://eli.thegreenplace.net/2011/02/07/how-debuggers-work-part-3-debugging-information

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/08/msg079592.html
http://mail-index.netbsd.org/source-changes/2016/12/09/msg079612.html
http://mail-index.netbsd.org/source-changes/2016/12/09/msg079631.html
http://mail-index.netbsd.org/source-changes/2016/12/09/msg079633.html
http://mail-index.netbsd.org/source-changes/2016/12/09/msg079634.html

2016-12-09

I've started to write new-world code for debug registers support on amd64.

I've finally decided to allow the use of all reasonable HW-specific features
available on amd64, but not to let users manually edit debug registers, as
that's uncomfortable and error-prone.

This means that users can use hw instruction watchpoints (breakpoints) and
data watchpoints. I decided to settle on a universal naming: watchpoints live
in the kernel and breakpoints are software traps.

I put the patch at http://www.netbsd.org/~kamil/dbg12.txt

I learnt that local and global watchpoints differ in that local ones are tied
to hardware task switching. This feature is not common in recent popular OSes.

What seems to be done:
 - x86 generic code for set_x86_hw_watchpoint() to enable HW watchpoint
   it's designed to be called before returning to user-space
 - x86 generic code for clear_x86_hw_watchpoints()
   it's designed to be called before entering to kernel-space
 - attach set_x86_hw_watchpoint() in userret(9)
 - add amd64 stubs for PT_GETWATCHPOINT, PT_SETWATCHPOINT and PT_NUMWATCHPOINTS
 - design struct ptrace_watchpoint in MI code
+#ifdef __HAVE_PTRACE_WATCHPOINTS
+/*
+ * Hardware Watchpoints
+ *
+ * MD code handles switch informing whether a particular watchpoint is enabled
+ */
+typedef struct ptrace_watchpoint {
+	int		pw_index;	/* HW Watchpoint ID (count from 0) */
+	lwpid_t		pw_lwpid;	/* LWP described (0 - means all) */
+	struct mdpw	pw_md;		/* MD fields */
+} ptrace_watchpoint_t;
+#endif
 - design struct mdpw in MD amd64 code
+/*
+ * This MD structure translates into x86_hw_watchpoint
+ *
+ * pw_address - 0 represents disabled hardware watchpoint
+ *
+ * conditions:
+ *     0b00 - execution
+ *     0b01 - data write
+ *     0b10 - io read/write (not implemented)
+ *     0b11 - data read/write
+ *
+ * length:
+ *     0b00 - 1 byte
+ *     0b01 - 2 bytes
+ *     0b10 - undefined
+ *     0b11 - 4 bytes
+ */
+struct mdpw {
+	void *md_address;
+	int   md_condition;
+	int   md_length;
+};
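
A sketch of how the condition and length values listed above map onto the x86
DR7 control register, where watchpoint n has its enable bits at 2n (local) and
2n+1 (global), its R/W field at bit 16+4n and its LEN field at bit 18+4n. The
function names are hypothetical, and using the local enable bit rather than
the global one is an assumption of this sketch.

```c
#include <stdint.h>

/*
 * Return dr7 with hardware watchpoint `index` (0-3) enabled using the
 * 2-bit condition and length encodings from the mdpw comment.
 */
uint64_t
dr7_enable(uint64_t dr7, int index, int condition, int length)
{
	const int ctl = 16 + 4 * index;	/* R/Wn at ctl, LENn at ctl + 2 */

	dr7 &= ~(UINT64_C(0xf) << ctl);		/* clear old R/Wn and LENn */
	dr7 |= (uint64_t)(condition & 0x3) << ctl;
	dr7 |= (uint64_t)(length & 0x3) << (ctl + 2);
	dr7 |= UINT64_C(1) << (2 * index);	/* Ln: local enable */
	return dr7;
}

/* Return dr7 with watchpoint `index` disabled and its fields cleared. */
uint64_t
dr7_disable(uint64_t dr7, int index)
{
	dr7 &= ~(UINT64_C(0x3) << (2 * index));		/* clear Ln and Gn */
	dr7 &= ~(UINT64_C(0xf) << (16 + 4 * index));	/* clear R/Wn, LENn */
	return dr7;
}
```

For example, a 4-byte read/write watchpoint in slot 0 is
dr7_enable(0, 0, 0x3, 0x3); the watched address itself goes into DR0.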


pw_index looks odd, and so does the address in mdpw.. but I need to keep the
index of the watchpoint somewhere. I want to use the data argument to specify
sizeof(struct ptrace_watchpoint). Why put the address in MD? I noted that there
might be hardware watchpoints on actions like branching (however not on amd64
with CPU debug registers).
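
A sketch of the kind of sanity check that passing the structure size in the
data argument makes possible. The *_sketch structures are simplified stand-ins
for the draft above, and watchpoint_check() with its limits is hypothetical,
not the code from the patch.

```c
#include <errno.h>
#include <stddef.h>

#define X86_HW_WATCHPOINTS 4	/* DR0-DR3 */

/* Simplified stand-ins for the drafted structures. */
struct mdpw_sketch {
	void *md_address;
	int   md_condition;
	int   md_length;
};

struct ptrace_watchpoint_sketch {
	int                pw_index;
	int                pw_lwpid;	/* lwpid_t in the draft */
	struct mdpw_sketch pw_md;
};

/* Return 0 if the request looks sane, or an errno value otherwise. */
int
watchpoint_check(const struct ptrace_watchpoint_sketch *pw, size_t data)
{
	if (data != sizeof(*pw))
		return EINVAL;		/* caller built against another layout */
	if (pw->pw_index < 0 || pw->pw_index >= X86_HW_WATCHPOINTS)
		return EINVAL;
	if (pw->pw_md.md_condition < 0 || pw->pw_md.md_condition > 3 ||
	    pw->pw_md.md_condition == 2)
		return EINVAL;		/* 0b10 (io) is not implemented */
	if (pw->pw_md.md_length < 0 || pw->pw_md.md_length > 3 ||
	    pw->pw_md.md_length == 2)
		return EINVAL;		/* 0b10 is undefined */
	return 0;
}
```

A size mismatch is rejected before any field is read, so the structure can
later grow without old binaries being silently misinterpreted.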

Plan for tomorrow:
 - implement ptrace(2) calls to set, get and count watchpoints
 - call clear_x86_hw_watchpoints() in all appropriate places
 - add i386 calls
 - evaluate 32 and 64-bit compatibility

Commits today:
None

2016-12-10

Today I refactored the hardware watchpoint interface to be per LWP and not per
process. It's up to the tracer to apply watchpoints to all threads within a
process. I think this concept is a better fit for the current design of the
kernel, because technically I'm adding code for threads and I was forcing these
threads to look up process-specific data. I keep thinking that it's useful
for the tracer to set watchpoints on a per-thread basis.

I also came to the conclusion that mixing single step and watchpoints in the
same continuation of threads/process must be prohibited, as both mechanisms can
share the same MD features to achieve it. If someone is stepping the code and
has watchpoints - PT_STEP always wins. This restriction also makes the logic
for x86 significantly simpler and more straightforward, as there is no need to
handle special cases in hw watchpoints.

I studied a document about the hardware watchpoints implementation on Linux;
there (at least in the past) was a mechanism to dynamically allocate
watchpoints for kernel and userspace. The reading was enlightening and
reassured me that I understand the needs correctly. Mainly that the kernel must
set watchpoints for all CPUs while user-land only for a single CPU (per pthread
or per process).

It also came to my mind that it's a waste of time to disable watchpoints on
kernel-enter paths; it's sufficient to ensure that watchpoints are never set on
code within the kernel and, in narrow cases, handle traps when watchpoints are
triggered due to some syscalls (copyin/copyout). It's possible thanks to
userret(9), as it's the only place needed to reset unused watchpoints.
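
Ensuring that a watchpoint never covers kernel addresses can be sketched as a
range check at the time the watchpoint is set. SKETCH_USER_VA_MAX is an
illustrative constant (the real limit is machine-dependent) and
watchpoint_range_is_user() is a hypothetical helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative upper bound of user space; the real limit is MD. */
#define SKETCH_USER_VA_MAX UINT64_C(0x00007f8000000000)

/* True if [addr, addr + len) lies entirely below the user-space limit. */
bool
watchpoint_range_is_user(uint64_t addr, uint64_t len)
{
	if (len == 0 || addr >= SKETCH_USER_VA_MAX)
		return false;
	/* Written this way to avoid overflow in addr + len. */
	return len <= SKETCH_USER_VA_MAX - addr;
}
```

With such a check in the set path, the kernel-enter paths need no extra work;
only copyin/copyout style accesses to watched user memory remain to be handled.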

Todo:
 - add amd64 specific code to set/get watchpoints
 - compat32
 - test

I really hope that it will work till Monday.

I put temporary code on:
http://www.netbsd.org/~kamil/dbg15.txt

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/10/msg079681.html
http://mail-index.netbsd.org/source-changes/2016/12/11/msg079710.html

2016-12-11

Today I added missing functionality and I got PT_COUNT_WATCHPOINTS to work....
with the small exception that I'm returning the right value with errno. I will
fix it tomorrow and move on to testing PT_(SET|GET)_WATCHPOINT.

Commits today:
None.

2016-12-12

Today I finished the work on the hardware watchpoint interface in ptrace(2).

I wrote a mail to tech-kern@:

    http://mail-index.netbsd.org/tech-kern/2016/12/13/msg021326.html

With the following content:

I've prepared an interface for hardware watchpoints:

    http://netbsd.org/~kamil/patch-00023-ptrace-watchpoints.txt

For the purpose of this task I propose to call monitoring of data
operations "watchpoints" and monitoring of instruction execution
"breakpoints". However, this interface is not limited to either, as a
port might expose any other type of event to be monitored (like branch
instructions). When I refer to "hardware" watchpoints, I mean any type of
trap realized with hardware assistance, without the need for a software
trap.


My goals for this project:
 - restrict kernel-side code to the functional minimum,
 - restrict the performance impact to a minimum,
 - security - don't expose weaker points,
 - make common MI parts where applicable,
 - if something is doable in userlevel debugger code, don't put extra
   functions into the kernel.

Benefits of this project:
 - hardware breakpoints without violating mprotect restrictions,
 - make it possible to observe data changes,
 - a basic and common interface to set breakpoints within a few lines of
   code [1].

[1] Software breakpoints are definitely more complex, as they overwrite the
target's .text section, inserting instructions there to generate traps, and
then they need to move the PC backwards and restore the original instruction
for the target.


The design is as follows:

1. Accessors through:
 - PT_WRITE_WATCHPOINT - write new watchpoint's state (set, unset, ...),
 - PT_READ_WATCHPOINT - read watchpoint's state,
 - PT_COUNT_WATCHPOINT - receive the number of available watchpoints.

2. Hardware watchpoint API is designed to be MI with MD specialization.
MI parts:
 - ptrace(2) calls as mentioned in 1.

 - struct ptrace_watchpoint of the following shape:

/*
 * Hardware Watchpoints
 *
 * MD code handles switch informing whether a particular watchpoint is
 * enabled
 */
typedef struct ptrace_watchpoint {
	int		pw_index;	/* HW Watchpoint ID (count from 0) */
	lwpid_t		pw_lwpid;	/* LWP described */
	struct mdpw	pw_md;		/* MD fields */
} ptrace_watchpoint_t;

 - example specialization for amd64:

/*
 * This MD structure translates into x86_hw_watchpoint
 *
 * md_address - 0 represents disabled hardware watchpoint
 *
 * conditions:
 *     0b00 - execution
 *     0b01 - data write
 *     0b10 - io read/write (not implemented)
 *     0b11 - data read/write
 *
 * length:
 *     0b00 - 1 byte
 *     0b01 - 2 bytes
 *     0b10 - undefined (8 bytes in modern CPUs - not implemented)
 *     0b11 - 4 bytes
 *
 * Helper symbols for conditions and length are available in <x86/dbregs.h>
 *
 */
struct mdpw {
	void	*md_address;
	int	 md_condition;
	int	 md_length;
};

 - I put md_address and the other fields in the MD part as they're purely
MD specific. I wanted to leave room for possible watchpoint types
without a specified address.


3. Do not expose CPU Debug Registers to userland. I finally decided to
restrict the underlying hardware implementation (based on CPU Debug
Registers) to the kernel only. In FreeBSD these registers are part of the
machine context (mcontext); I think it's a misdesign, as they should be
limited to the tracer only and have no use case in the tracee. CPU
Debug Registers can expose privileged gates, and in theory a tracee could
try to alter watchpoints set by a debugger on the fly. AMD64 Debug
Registers are rather part of the context of a tracer observing a tracee.

4. Do not set watchpoints globally per process; limit them to threads
(LWP). In general, kernel hardware watchpoints must be set for all CPUs
and userland watchpoints must be limited to the CPU running a thread.
Adding process-wide management to the ptrace(2) interface calls adds extra
complexity that should be pushed away to user-land code in debuggers.

5. Do not allow mixing PT_STEP and hardware watchpoints; when
single-stepping the code, disable (that is: don't set) hardware
watchpoints for threads. Some platforms might implement single-step with
hardware watchpoints, and managing both at the same time generates
pointless extra complexity.

6. I have no strong opinions on si_code; on amd64 we set TRAP_TRACE for
hardware watchpoints. POSIX specifies two types: TRAP_BRKPT and
TRAP_TRACE. I don't want to introduce a new third type, as it might be
unportable across ports and its usability is questionable (I would limit
it myself to a curiosity without significant real-life impact).

7. Linux has interfaces to allocate (reserve) watchpoints for in-kernel
and user-land usage. I think it's unnecessary complexity for
little gain. In case someone uses in-kernel hardware watchpoints, I just
recommend to stop setting them for userland (it's doable with a single
if() condition...).

8. The design for the amd64 port is as follows:
 - track watchpoints in a private LWP structure, within a table,
 - all threads call userret() before entering userland; at the end of
   this function check whether a thread has active watchpoints and isn't
   going to single-step, and if so set the hardware watchpoints,
 - never reset hardware watchpoints after entering the kernel; traps
   must be specified for memory in the user-level range and mustn't
   trigger within kernel code.

9. I was trying to leave room for in-kernel x86 watchpoints in the future.


Currently only the amd64 part is finished and tested. The i386 and XEN
configurations aren't tested, although both might work out of the box.
compat32 isn't fully implemented.

Tomorrow I plan to add dedicated ATF tests for this interface; the draft
that I have used so far is available at:

    http://netbsd.org/~kamil/t_ptrace_wait-hw-watchpoint-draft.c

The 32-bit work is planned to start once LLDB is fully functional on
amd64; currently it's out of scope.
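
To make the amd64 condition and length encodings from the mail more concrete,
here is a minimal sketch of how one watchpoint slot maps onto the x86 DR7
control register (local-enable bit at 2*i, R/W field at 16+4*i, LEN field at
18+4*i, as laid out in the Intel/AMD manuals). The function dr7_bits_for() and
the WP_* macros are hypothetical names for this illustration; the real helper
symbols live in <x86/dbregs.h>, and this is not the kernel code.

```c
#include <stdint.h>

/* Hypothetical names mirroring the encodings listed in the mail. */
#define WP_COND_EXEC	0x0	/* execution */
#define WP_COND_WRITE	0x1	/* data write */
#define WP_COND_RW	0x3	/* data read/write */
#define WP_LEN_1	0x0	/* 1 byte */
#define WP_LEN_2	0x1	/* 2 bytes */
#define WP_LEN_4	0x3	/* 4 bytes */

/*
 * Return the DR7 bits that enable watchpoint slot `index` (0..3) for the
 * current task with the given condition and length encodings.
 * Returns 0 for out-of-range arguments.
 */
uint32_t
dr7_bits_for(int index, int condition, int length)
{
	uint32_t bits;

	if (index < 0 || index > 3 || condition < 0 || condition > 3 ||
	    length < 0 || length > 3)
		return 0;

	bits = UINT32_C(1) << (2 * index);			/* Ln: local enable */
	bits |= (uint32_t)condition << (16 + 4 * index);	/* R/Wn */
	bits |= (uint32_t)length << (18 + 4 * index);		/* LENn */
	return bits;
}
```

For example, a 4-byte data-write watchpoint in slot 0 gives
dr7_bits_for(0, WP_COND_WRITE, WP_LEN_4) == 0xD0001.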

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/12/msg079771.html

2016-12-13

I tried to clean up the sources for inclusion of hardware watchpoints.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079795.html
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079798.html
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079799.html
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079802.html
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079803.html
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079804.html
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079805.html
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079806.html
http://mail-index.netbsd.org/source-changes/2016/12/13/msg079808.html

2016-12-14

I've caught regressions in patch-00023-ptrace-watchpoints.txt and addressed
them accordingly... well, almost completely. There is still one mysterious bug
that lets the tests pass outside of the ATF context but breaks them inside ATF.

I contacted Pavel Labath from LLDB (Google team) to discuss the plan for the
Process Plugin. He asked me to prepare a diff that allows building the Linux
Process Plugin on NetBSD, as it will unveil the work needed to progress towards
creating shared code for Linux and BSD.

We scheduled a talk for tomorrow to discuss the details.

I've mailed lldb-dev@ with an appropriate summary:

Hello,

I've prepared two patches to make the Linux Process Plugin buildable on
NetBSD.

The diff will help to transform the Linux process plugin to common code,
shared between Linux and BSDs (FreeBSD, NetBSD).

lldb-git: Enable Linux Process plugin on NetBSD

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4b00674e876ebfe427743759de13ead420112fd4

lldb-git: Disable unbuildable code-fragments in the
LinuxProcessPlugin/NetBSD

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e1ef012c16ab7729918ae367150b13bf0d77650b

Comments on the disabled code:

1. Thread resume/suspend operation - to be implemented on NetBSD with
ptrace(2).

2. PTRACE_SETSIGINFO - equivalent currently unsupported, I will need to
add support for it in ptrace(2).

3. PTRACE_O_TRACEEXEC, PTRACE_O_TRACECLONE, PTRACE_O_TRACEEXIT -
equivalent to be implemented.

4. No tracing of VFORK events implemented.

5. Hardware assisted watchpoints (debug registers on amd64) have their
dedicated ptrace(2) API to set/unset watchpoints, they do not export raw
debug registers.

6. Other ptrace(2) calls have their equivalents on NetBSD
(PTRACE_PEEKUSER, PTRACE_POKEUSER etc).

7. SIGTRAP has currently only two si_code values (specified by POSIX).

8. No SI_TKILL available.

9. There is no process_vm_readv/process_vm_writev call available.

10. __WALL and __WNOTHREAD are Linux specific, but they have their
counterparts.

11. The cpu_set_t structure and appropriate functions have similar
counterparts on NetBSD.

12. No <sys/procfs.h> and no dependency on procfs (/proc) is acceptable;
everything shall be accessible through ptrace(2) and sysctl(7).

13. personality.h - unsupported as header/function call, compatibility
with other OSes (mostly Linux) implemented on syscall level.

14. Process, system thread (lwp) and POSIX thread (pthread_t) are not
the same on NetBSD, unlike on some other systems, and cannot be mixed.


The currently missing features on the NetBSD side don't stop us from
making a functional implementation; the lacking interfaces might be added
after getting the core part to work.


  -- http://lists.llvm.org/pipermail/lldb-dev/2016-December/011739.html

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/14/msg079828.html
http://mail-index.netbsd.org/source-changes/2016/12/14/msg079829.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4b00674e876ebfe427743759de13ead420112fd4
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e1ef012c16ab7729918ae367150b13bf0d77650b

2016-12-15

I've fixed the issue with code traps; for some reason the culprit was the
optimization applied in the ATF tests. I needed to add an attribute to the
traced function to stop the compiler from optimizing it.
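
One plausible shape of such a fix, as a sketch only: if the compiler inlines
or folds the traced function, the address being trapped on may never execute
as a separate function body. The example below assumes the GCC/Clang noinline
attribute; traced_function() is a hypothetical stand-in, and the attribute
actually used in the ATF tests may differ.

```c
/*
 * Hypothetical stand-in for a function that a tracer sets an execution
 * trap on. noinline keeps a real, separately callable body at one address;
 * the volatile accumulator stops the loop from being folded to a constant.
 */
__attribute__((noinline)) int
traced_function(int n)
{
	volatile int sum = 0;
	int i;

	for (i = 1; i <= n; i++)
		sum += i;
	return sum;
}
```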

I don't understand why it happened this way, however I need to move on to LLDB.

I've committed the hardware watchpoint API to master and bumped the kernel ABI
to 7.99.50!

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079879.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079880.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079888.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079890.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079891.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079892.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079893.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079895.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079896.html
http://mail-index.netbsd.org/source-changes/2016/12/15/msg079897.html

2016-12-16

I've sent a mail to tech-toolchain@:

Hello,

My initial goal is to copy the Linux Process Plugin and add minimal
functional support for NetBSD. It will be followed with running and
passing some tests from the lldb-server test-suite.

This is a departure from the original plan of porting the FreeBSD Process
Plugin to NetBSD, as the FreeBSD one lacks remote debugging support
and needs to be redone from scratch.

I'm going to fork wip/lldb-git (as of the git snapshot from 16 Dec 2016)
to wip/lldb-netbsd for the purpose of this task and work there.

The next step after finishing this task is to sync up with Pavel from the
LLDB team after New Year. The NetBSD Process Plugin will be used as a
reference to create a new Common Process Plugin shared between Linux and
(Net)BSD.

This work is sponsored by <The NetBSD Foundation>.

 -- http://mail-index.netbsd.org/tech-toolchain/2016/12/16/msg002883.html

I've prepared the said package in wip/lldb-netbsd. For some reason llvm.org
went down, so I will resume porting LLDB tomorrow.

Changes today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b57f5e92f4434b854176027dfde2c3ecbf8eb79a
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=98210e97b7854a76acaa1c4a08aca1228f08a4a4
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5ee8f6b1b3d3b08795f34c9196e3f1a60fbeb4e4
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=c0f22af0f9002b1b0a9c54b9421bd0af1171a08f
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=6f62b913f0ecd01c3784f6a4489a1ef8d5f22006
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=428b12dc8eb3069dc9161be969d495169c50e310
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2f77aa7f6a43dcb54d6028d93b79bff280bb70fd
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=385997ae7cefeddfd446b2246a982f8c27ab9e2a

2016-12-17

Today was mostly a day of plain effort adapting the process plugin to NetBSD.

Changes:
 -- Added a clone of the Linux Process Plugin - called NetBSD Process Plugin
 -- Initial adaptation for Attach, Detach, Continue, Resume
 -- Rip out code for handling registers other than General Purpose
 -- Kill the workaround for PT_STEP for arm64
 -- Adapt waitpid(2) for NetBSD
 -- Adapt the Ptrace Wrapper for the NetBSD API
 -- Initial effort towards abandoning pthread_t handling - we need to operate on LWPs

Changes today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=d9114262bc976a6c39b3ff3df416f2df73b2a420
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=760a4efd7dc5d2890d7a1937b2c9840bb6e8e1e9
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=45ac397da8dcac1a37fbb79d91e4e13683ce70fc
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=43511d7320f1be16cbbbcf5dc01a34ce70547d62
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a09654ee29f0cfe5d743fcddc6020e819b7c5e4e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0a0ed070694008a8045c54e4c7b5e3889ba18b95
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=3dc57c6a2d251a5f4df392e0a7f943cb3ba88ad1
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0d0128c4e58c1e00e187a10f9c9af50f7b42133e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e517bed1f6c990528b96de32d688d9e3a81c50ff
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=bfc6da3cb81ea68dc04bda67997643a31186cbe0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=ac978855a3ede2e813524be500c14ee3aa583fa0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a3b4b4c76d1563bcc26e61390bbf15d24f0d761c
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=60ba5e7b799a14136e066133b885dff8afd727b9
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b1d44a3ee2c70b1931443698ef27c66b78a82913
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e1e364a563cb7c6a54c4fc649f993eabf0e00321
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=308fec3153636bb5cb3905f46632e3cb4d8dcbdd

2016-12-18

The most important change today:
    lldb-netbsd:   Welcome to liblldbPluginProcessNetBSD.a!

Important changes today:
 - kill more i386 code
 - fix PtraceWrapper prototype
 - kill procfs
 - kill more remnants of the PT_STEP workaround for arm64 on Linux
 - kill Ptrace.h and replace with <sys/ptrace.h>
 - completely drop support for registers, planned to be re-added later
 - replace watchpoint accessors with dummy functions
 - mark NetBSD as without need for attach/detach hack
 - add source/Host/netbsd/ProcessLauncherNetBSD.cpp
 - restore old state of lldb-git

Changes today:
http://mail-index.netbsd.org/source-changes/2016/12/19/msg080015.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5de006af330244259612838e5cdc68b76ee4d9b1
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0749df83af1aee6c89844700e7157a922065d462
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2e6f052e8a898bc87b35178f75b76eaad79f4d99
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=433c2776703ad6212780df8e02ab5c04a18f7f3f
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4fc3d83618d72746c4d99abb91fb64a6c73124c0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=956b96ca0391f9463b70951b30923102e668a59c
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=dab62628e2e87e637b5255f8d5510d11b097d48b
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2b575b968243690e01e59520efa0630bfc8a8d58
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=407247ebfa6edcf14b8c8ec4f746757114c08711
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=cde196b09d69fd2d6d8a6cbfe95544b64573ede7
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=cad19bd4f66d2644fa5eb4da2e583a0d17b37398
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=c32936e71211b272ce761e9a460b73eeee39d35b
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e90f8fb78a9d4db05f2ca451ee71bd25623a1afc
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5e4df1a15a1954786805aa1ecd26376d0bd26eee
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=60516888555ad93b41fc97a8d0795a0c36972a5d

2016-12-19

Changes in the ported code today:
   - try to add process specific logging in NetBSD
   - synchronize PlatformNetBSD with Linux

LLDB works this way:
(lldb) target create "/bin/cat"
Current executable set to '/bin/cat' (x86_64).
(lldb) r
[ ... hangs ]

I'm trying to determine what should happen and which event is supposed to be
waited for. At the moment I don't know what happens or what's wrong.

Changes today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=1d7f89b83ef5bbb52649149536da95e082e92b9d
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=09dc96039d0b94f69095c13eeb741706dabc836a
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=f240d89b30e61c44e9a09d5a2631df5a7343da76
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b8b36b2397d3543c9efeb0774dd4f5c6d0314e87
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=383c264ded9be6df6a6df4908143cb5a90f729a6
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=9f24252075e9c03eab06b898ea2cabde02f5e5cf
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=7532cb79202aa44b28b89b5eaad6beab2d8e26e7
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=1a41858f2a08860a19541d8a8ff09e85a3b49bb1

2016-12-20

I fixed the installation of executables under the bin/ directory.

I cherry-picked a patch from the upstream Bugzilla:
   https://llvm.org/bugs/show_bug.cgi?id=31433
Author: Jeremy Huddleston Sequoia


I've investigated std::call_once from C++11 to the point of making it usable to
some extent, as it is used widely in LLDB. It was the original cause of the
LLDB hangs.

 libc++ doesn't use pthread_once(3)
 
 The problem is with TLS used in combination with libstdc++ for
 std::call_once.
 
 Everything works (safely only in a single-threaded setup) if we remove
 __thread from the following fragment:
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #ifdef _GLIBCXX_HAVE_TLS
   __thread void* __once_callable;
   __thread void (*__once_call)();
 #else
   // Explicit instantiation due to -fno-implicit-instantiation.
   template class function<void()>;
   function<void()> __once_functor;
 
   mutex&
   __get_once_mutex()
   {
     static mutex once_mutex;
     return once_mutex;
   }
 
   // code linked against ABI 3.4.12 and later uses this
   void
   __set_once_functor_lock_ptr(unique_lock<mutex>* __ptr)
   {
     __get_once_functor_lock_ptr() = __ptr;
   }
 
   // unsafe - retained for compatibility with ABI 3.4.11
   unique_lock<mutex>&
   __get_once_functor_lock()
   {
     static unique_lock<mutex> once_functor_lock(__get_once_mutex(),
 defer_lock);
     return once_functor_lock;
   }
 #endif
 
   extern "C"
   {
     void __once_proxy()
     {
 #ifndef _GLIBCXX_HAVE_TLS
       function<void()> __once_call = std::move(__once_functor);
       if (unique_lock<mutex>* __lock = __get_once_functor_lock_ptr())
       {
         // caller is using new ABI and provided lock ptr
         __get_once_functor_lock_ptr() = 0;
         __lock->unlock();
       }
       else
         __get_once_functor_lock().unlock();  // global lock
 #endif
       __once_call();
     }
   }
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 
  --  src/external/gpl3/gcc/dist/libstdc++-v3/src/c++11/mutex.cc
 
 So far I have not reproduced this issue out of the libstdc++ context.
 Upstream reports that it might be related to ld.so and resolving
 __tls_get_addr, as it was causing trouble in the past.
 
 Perhaps something related:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52192
 
 std::call_once is frequently used in LLDB.
 

I've prepared a list of functions called in the following code-flow:

(lldb) r
hello world!
Process 25874 launched: '/public/a.out' (x86_64)
Process 25874 exited with status = 0 (0x00000000) 
(lldb)

The debugger still isn't usable; there is no way to set a breakpoint or
catch a signal from the child and have it work.

Functions to be altered:
 - GetRegisterContext - not implemented on NetBSD
 - SigchldHandler - adapted for NetBSD
 - Resume (process) - kill software PT_STEP
 - SupportedHardwareSingleStepping - to be killed for now
 - Resume (thread) - kill resuming single thread as it is done on Linux
 - MonitorCallback - adapt for NetBSD
 - StopTrackingThread - adapt for NetBSD

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=6d171e723664bf781016b53125a6515e9c05e2e6

2016-12-21

I've committed a hack to disable the TLS path in libstdc++ for std::call_once.

I also altered the functions mentioned yesterday and initially adapted them for
NetBSD.

Everything works the same way as yesterday. I will try to run the tests for
lldb-server tomorrow and maybe find a useful way to narrow down the issues.

In general what's broken:
 - thread handling,
 - process handling,
 - signal handling,
 - breakpoints,
 - PT_STEP/PT_CONTINUE.

The only thing that works is catching the return value of an exited program.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=6541ee7593c09d25c31e52ef572054679e0699a0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5a5e04c2a427d23d1c52230a4692422559e6b805
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=08661026df4a48ae013e4b3f8b8268c87cd53dd1
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=6f4441e65f15f7fc96befbd71642397cc26913db
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=03418e25028342c1d940c7887883f2a9338425d2

http://mail-index.netbsd.org/source-changes/2016/12/21/msg080075.html

2016-12-22

I was researching tests today. So far check-lldb is the most useful test-suite.

Results:

===================
Test Result Summary
===================
Test Methods:       1197
Reruns:                1
Success:             252
Expected Failure:     20
Failure:             307
Error:               162
Exceptional Exit:      1
Unexpected Success:    1
Skip:                436
Timeout:              18
Expected Timeout:      0

 -- http://netbsd.org/~kamil/lldb/check-lldb-r289997-2016-12-22.txt

I checked some particular tests, but so far they don't seem to be false
positives. They fail mostly because not much useful functionality works yet.

I'm recording the results now to have some kind of starting point and to see
progress as the number of successful tests grows.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=f977e5336d9636443afd3b8d6a2d9373e4469dcc

2016-12-23

Today I started to understand relations between processes and threads in lldb.

I also learnt about SIGCHLD as an alternative way to be notified about child
events, instead of blocking in wait(2)-like functions.
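
For reference, a minimal portable sketch of the SIGCHLD approach: the handler
only records that something happened, and the status is still collected with
waitpid(2) afterwards, so SIGCHLD serves as a notification rather than a full
replacement for wait(2). The helper name wait_child_via_sigchld() is
hypothetical and this is not LLDB code.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigchld;

static void
on_sigchld(int signo)
{
	(void)signo;
	got_sigchld = 1;	/* only record the event; reap it later */
}

/*
 * Fork a child that exits with `code`, sleep until SIGCHLD arrives,
 * then reap the child. Returns its exit status, or -1 on error.
 */
int
wait_child_via_sigchld(int code)
{
	struct sigaction sa, old_sa;
	sigset_t block, old_mask;
	int status, ret = -1;
	pid_t pid;

	sa.sa_handler = on_sigchld;
	sa.sa_flags = 0;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
		return -1;

	/* Block SIGCHLD so it cannot be delivered before sigsuspend(). */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	sigprocmask(SIG_BLOCK, &block, &old_mask);

	got_sigchld = 0;
	pid = fork();
	if (pid == 0)
		_exit(code);
	if (pid > 0) {
		sigset_t wait_mask = old_mask;

		sigdelset(&wait_mask, SIGCHLD);
		while (!got_sigchld)
			sigsuspend(&wait_mask);	/* atomically unblock and wait */
		if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
			ret = WEXITSTATUS(status);
	}

	/* Restore the previous signal mask and SIGCHLD disposition. */
	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	sigaction(SIGCHLD, &old_sa, NULL);
	return ret;
}
```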

I was working on making signals from a child catchable. However, the result is
a regression, which indicates that SIGSTOP is not reported on attach to the
tracee.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/24/msg080141.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=f4564b21eefeb0a520de92a11a21d836ece1e2b1
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=8c3e603c0939698197ec2b9201c506d200e72a30
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=dfbb17991112ebfb3c5ad4e28eface7b999af576
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=853bddaee755fa719bf920f03aa6b145b4ea8728
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=511aa0d56ff444e0153da2fe225b4200e00cd2c5
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e7fa627d3aaae6e4218e073fb3ecc0a9ee5a51b0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=3c051e2b755bfdccb4995ef68846e9eb40b78e50

2016-12-24

I'm still wondering why NativeProcessNetBSD::LaunchInferior() is wrong.

Function Process::WaitForProcessStopPrivate:

StateType
Process::WaitForProcessStopPrivate(EventSP &event_sp,
                                   const Timeout<std::micro> &timeout) {
  StateType state;
  // Now wait for the process to launch and return control to us, and then
  // call DidLaunch:
  while (true) {
    event_sp.reset();
    state = GetStateChangedEventsPrivate(event_sp, timeout);
    if (StateIsStoppedState(state, false))
      break;
    // If state is invalid, then we timed out
    if (state == eStateInvalid)
      break;
    if (event_sp)
      HandlePrivateEvent(event_sp);
  }
  return state;
}

It never receives a state change to stopped from the NetBSD Process Plugin...
this is odd.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/24/msg080154.html

2016-12-25

I have fixed NativeProcessNetBSD::LaunchInferior()... the culprit was the lack
of tracking of the thread-stop property in NativeThreadNetBSD. Generic LLDB
layers call NativeThreadNetBSD::GetStopReason and it must return a sane
(SIGSTOP) value.
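
What the generic layers expect to see can be shown with plain POSIX calls,
without ptrace(2): a child stops itself with SIGSTOP, and the parent learns
the stop signal from the wait status. This is a portable sketch of the idea,
not LLDB code; observe_child_stop_signal() is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>

/*
 * Fork a child that stops itself, and return the signal that stopped it
 * as seen by the parent (expected: SIGSTOP), or -1 on error.
 */
int
observe_child_stop_signal(void)
{
	int status, sig = -1;
	pid_t pid;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		raise(SIGSTOP);		/* child: stop until resumed or killed */
		_exit(0);
	}

	/* WUNTRACED makes waitpid(2) report stopped (not only exited) children. */
	if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status))
		sig = WSTOPSIG(status);

	/* Clean up: kill the stopped child and reap it. */
	kill(pid, SIGKILL);
	(void)waitpid(pid, &status, 0);
	return sig;
}
```

A thread object that does not remember this stop signal has nothing sane to
report when asked for its stop reason, which matches the hang described here.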

I also added more code there for handling the siginfo_t cases from NetBSD.
However, currently we cannot receive it from the tracee. wait6(2)-like
functions get an artificial siginfo_t of SIGCHLD type, which doesn't help.

hello.c on FreeBSD (lldb38):
(lldb) r
Process 51024 launching
Process 51024 launched: './hello' (x86_64)
Hello world!
Process 51024 exited with status = 0 (0x00000000) 
(lldb)

and on NetBSD:
(lldb) r
Hello world!
Process 14106 launched: './hello' (x86_64)
Process 14106 exited with status = -1 (0xffffffff) lost connection
(lldb)

I have a suspicion that the process runs and exits before proper
initialization, and that it dies before it is queried for exit(2).

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/26/msg080187.html
http://mail-index.netbsd.org/source-changes/2016/12/26/msg080194.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=7b6064141c3282c975febcd0b28e9b942d6dd338
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=edc2b862930e783d499670f36298ceb201c2ae85
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=f9cd8c4f659bf6dcd3133b24f00b54a49be9c1ad
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=185395cf6ae7932ff23e62faa3b897ebed034ce7

2016-12-26

I finally decided that a feature to read siginfo_t from the tracee is required.
I've implemented a preliminary version:

http://netbsd.org/~kamil/patch-00026-pl_siginfo.txt

Pity, the kernel is now broken due to new npf(4) code, so I will resume testing
tomorrow.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/27/msg080229.html
http://mail-index.netbsd.org/source-changes/2016/12/27/msg080230.html

2016-12-27

I verified that my code for siginfo_t works. Christos gave me some pointers and
I will work on getting the interface to access siginfo_t in read/write mode
into proper shape.

New patch for siginfo:
http://netbsd.org/~kamil/patch-00026-pl_siginfo.2.txt

Local test:
http://netbsd.org/~kamil/info.c

I mailed tech-toolchain@ with my goals:

I would like to set the following goals for the first milestone with LLDB:
1. Add support to peek and poke the siginfo_t that stopped the tracee,
document, add ATF tests, integrate the "peek" action with LLDB,
2. Add support for the tracer to detect and distinguish the following events:
exec, clone, fork, vfork, vfork finished (resume parent), (exit?, posix
spawn?,) lwp creation, lwp termination, single step, breakpoint,
hardware trap (breakpoint/watchpoint); document, add ATF tests,
integrate with LLDB.

By LLDB, I mean the code in pkgsrc-wip/lldb-netbsd. By integration with
LLDB, I mean detecting the appropriate event in the MonitorCallback code
(part of Process NetBSD) and adapting other functions where needed. The
planned documentation and ATF tests are restricted to the ptrace(2)
interface; they are not about LLDB.

After finishing the above tasks, I plan to sync up with the LLDB team
and work towards implementation of accessors for CPU registers,
breakpoints, watchpoints and other basic functionality required to make
LLDB usable as a (restricted) debugger on NetBSD.

For the last milestone with LLDB, I plan to send NetBSD support
upstream, add 32-bit x86 support, import LLDB to base (in the LLVM target)
and add ATF tests verifying LLDB.


 -- http://mail-index.netbsd.org/tech-toolchain/2016/12/28/msg002896.html

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/27/msg080229.html
http://mail-index.netbsd.org/source-changes/2016/12/27/msg080230.html

2016-12-28

Today I again faced the debt of the broken std::call_once in libstdc++... I
rebuilt the distribution and packages for NetBSD-current 7.99.53, but I
forgot to apply the local patch for libstdc++ to stop using TLS. Without it
there is no way to use LLDB on NetBSD.

I will postpone upgrading my computers until LLDB stage I is done.

I synchronized the documentation for siginfo(2) with the current reality.

I also noticed that there is an si_trap value specific to SIGTRAP, so perhaps
it's a better place for additional values in the case of a hardware-assisted
breakpoint or watchpoint, instead of adding a new si_code value next to
TRAP_BRKPT and TRAP_TRACE. A side effect is that these values in si_trap might
be used in the compat_linux(8) code.

I really wish I had more time and the ability to design hardware-assisted traps
for debuggers on all the other Tier I ports.

So, the plan for tomorrow: resurrect the patch for libstdc++, design the
siginfo_t patch for the kernel (add ATF tests and documentation, publish),
integrate it with LLDB and move on to exec/clone/fork/vfork/vfork finished/etc.

I wish there were volunteers to fix std::call_once. It's good that I want
to import LLDB to base just for LLVM, where libc++ works better for this call.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/28/msg080313.html
http://mail-index.netbsd.org/source-changes/2016/12/28/msg080314.html
http://mail-index.netbsd.org/source-changes/2016/12/28/msg080315.html
http://mail-index.netbsd.org/source-changes/2016/12/28/msg080316.html
http://mail-index.netbsd.org/source-changes/2016/12/28/msg080317.html

2016-12-29

I learnt that sparc and sparc64 cannot call PT_SETFPREGS prior to using the
FPU.. this is why ATF tests fail on these ports.. and perhaps other ports are
affected for the same reason.

Besides that, I upgraded the kernel for NetBSD-current... and it panics for
some reason on signals; I'm unsure whether this is caused by my code or not..

I have prototyped code for peeking and poking siginfo_t from/to an lwp.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/29/msg080348.html

2016-12-30

I switched temporarily from siginfo_t handling to VFORK and VFORKDONE.. and it
turned out that implementing them is much more difficult than I expected. The
current version for FORK is implemented almost entirely on hacks.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/30/msg080360.html

2016-12-31

I updated TODO.ptrace and prepared myself to resume work in the New Year.

Commits today:
http://mail-index.netbsd.org/source-changes/2016/12/31/msg080383.html

2017-01-01

I finally modeled the final state of PT_[GS]ET_SIGINFO.

http://netbsd.org/~kamil/patch-00026-pl_siginfo.3.txt

GET seems to more or less work, SET panics the kernel. I will resume it
tomorrow.

Commits today not relevant to the project:
http://mail-index.netbsd.org/source-changes/2017/01/01/msg080393.html

2017-01-02

I verified that GET is functional, however there are odd issues with si_code
values for si_code-less variations as this value is set to.. 0xffffffff, not
0x7fff like LWP_NOINFO.

Today I was also focused on rebasing my code to a new kernel.. it took a while
as there was a general flex upgrade and it caused breakage in several code
fragments.

SET is still broken.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/02/msg080439.html
http://mail-index.netbsd.org/source-changes/2017/01/03/msg080460.html

2017-01-03

OK, PT_SET_SIGINFO is now fixed - panics are resolved and I had fun searching
for the place in the code where to inject the new faked siginfo_t.. I've
finally found it in postsig() in src/sys/kern/kern_sig.c:

/*
 * Commit to taking the signal before releasing the mutex.
 */
action = SIGACTION_PS(ps, signo).sa_handler;
l->l_ru.ru_nsignals++;
if (l->l_sigpendset == NULL) {
    /* From the debugger */
    if (!siggetinfo(&l->l_sigpend, &ksi, signo))
        (void)siggetinfo(&p->p_sigpend, &ksi, signo);
} else
    sigget(l->l_sigpendset, &ksi, signo, NULL);


My panic in PT_SET_SIGINFO was related to lwp_delref(): it cannot be called
with a mutex held.

I was also talking with Pavel from Google and we finally will not go for a
shared process plugin as there is little code to be shared out there.

I've posted my code of the SIGINFO interface:
http://mail-index.netbsd.org/tech-kern/2017/01/04/msg021461.html

Commits today:
None

2017-01-04

I've committed tests for PT_[GS]ET_SIGINFO. I was also evaluating the switch
from std::call_once in LLDB to llvm::call_once, however it's not that trivial
so I will postpone it. Apparently there are no users of it outside of LLVM so
far (checked clang and lldb).
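
For illustration only: one TLS-free way to get call-once semantics is a small
atomic state machine. The sketch below is written in C11 rather than the C++
used by LLDB, it is not the llvm::call_once implementation, and all ex_* names
are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { EX_ONCE_NEW, EX_ONCE_RUNNING, EX_ONCE_DONE };

struct ex_once {
	atomic_int state;
};
#define EX_ONCE_INIT { EX_ONCE_NEW }

/* Run fn exactly once per ex_once object, without thread-local storage. */
void
ex_call_once(struct ex_once *once, void (*fn)(void))
{
	int expected = EX_ONCE_NEW;

	assert(once != NULL && fn != NULL);
	if (atomic_load_explicit(&once->state, memory_order_acquire) ==
	    EX_ONCE_DONE)
		return;
	if (atomic_compare_exchange_strong(&once->state, &expected,
	    EX_ONCE_RUNNING)) {
		fn();	/* we won the race: run the initializer */
		atomic_store_explicit(&once->state, EX_ONCE_DONE,
		    memory_order_release);
		return;
	}
	/* Another caller is (or was) initializing: wait until it is done. */
	while (atomic_load_explicit(&once->state, memory_order_acquire) !=
	    EX_ONCE_DONE)
		;	/* busy-wait; a real implementation would yield */
}

/* Sample initializer: counts how often it was invoked. */
int ex_init_calls;

void
ex_count_init(void)
{
	ex_init_calls++;
}
```

Calling ex_call_once() twice on the same object with ex_count_init leaves
ex_init_calls at 1. Calling it recursively from the initializer would spin
forever, which this sketch does not try to detect.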

So far nobody is interested in reviewing PT_[GS]ET_SIGINFO.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/04/msg080511.html

2017-01-05

I've prepared the code in lldb-netbsd for the addition of PT_[GS]ET_SIGINFO.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=3624a36a80b4378a47810af525d7aeb00b294ef4
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=d07620af079c638afe6d5f5cb4e1ad08799feb63

2017-01-06

I've committed code for TRAP_EXEC to monitor execve(2)-like events and the
famous API for siginfo_t: PT_SET_SIGINFO and PT_GET_SIGINFO.

It bumped the kernel API to 7.99.56.

I've added needed documentation, test for TRAP_EXEC.. and everything so far
seems to work fine.

So, on the roadmap for the monitor function in LLDB:
 - fork(2) - handled
 - execve(2) (TRAP_EXEC) - handled
 - program breakpoint (TRAP_BRKPT) - handled
 - single step (TRAP_TRACE) - handled
 - stop from LLGS (LLDB-GDB-SERVER) - handled

By handled I mean that the event is detected, in a code path that does nothing
yet.
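
A minimal sketch of such a detection step, assuming the monitor already knows
the signal number and si_code of the stop. The ex_* names and the numeric
si_code values are placeholders for this example, not the definitions from
<sys/siginfo.h>, and this is not the LLDB code itself.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Placeholder si_code values for this sketch only. */
enum {
	EX_TRAP_BRKPT = 1,	/* program breakpoint */
	EX_TRAP_TRACE = 2,	/* single step */
	EX_TRAP_EXEC = 3	/* execve(2)-like event */
};

enum ex_event {
	EX_EV_OTHER_SIGNAL,	/* not a SIGTRAP at all */
	EX_EV_UNKNOWN_TRAP,	/* SIGTRAP with a code we do not know */
	EX_EV_BREAKPOINT,
	EX_EV_SINGLE_STEP,
	EX_EV_EXEC
};

/* What the tracer learned about a stop: the signal and its si_code. */
struct ex_stop {
	int signo;
	int si_code;
};

/* Map a stop to an event; the caller decides what to do with it. */
enum ex_event
ex_classify_stop(const struct ex_stop *stop)
{
	assert(stop != NULL);
	if (stop->signo != SIGTRAP)
		return EX_EV_OTHER_SIGNAL;
	switch (stop->si_code) {
	case EX_TRAP_BRKPT:
		return EX_EV_BREAKPOINT;
	case EX_TRAP_TRACE:
		return EX_EV_SINGLE_STEP;
	case EX_TRAP_EXEC:
		return EX_EV_EXEC;
	default:
		return EX_EV_UNKNOWN_TRAP;
	}
}
```

Keeping classification separate from the reaction makes it easy to start with
empty handlers and fill them in one event at a time.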

TODO:
 - vfork(2)
 - vfork(2) parent resumed
 - LWP creation
 - LWP termination
 - hardware assisted breakpoint/watchpoint

I want to handle vfork(2) and LWP with EVENT_MASK and PROCESS_STATE.
For now fork(2) and vfork(2) seem unused in the Linux Process Plugin, however
I don't want to add LWP handling.. before vfork(2).. as there is already
fork(2).

It will look much cleaner to get
PTRACE_FORK 1
PTRACE_VFORK 2
PTRACE_VFORKDONE 4
PTRACE_LWPCREATE 8
PTRACE_LWPEXIT 16

Rather than FORK, LWP and at the end VFORK entries..
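
A small sketch of how such an event mask could be tested and printed. The bit
values are the ones listed above; the EX_ prefix and both helper functions are
hypothetical and only serve this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Event bits as listed above; EX_ marks them as local to this sketch. */
#define EX_PTRACE_FORK		1
#define EX_PTRACE_VFORK		2
#define EX_PTRACE_VFORK_DONE	4
#define EX_PTRACE_LWP_CREATE	8
#define EX_PTRACE_LWP_EXIT	16

/* Return nonzero when every bit of `event` is set in `mask`. */
int
ex_event_enabled(int mask, int event)
{
	return (mask & event) == event;
}

/* Write the enabled event names into buf, '|'-separated; return the count. */
int
ex_describe_mask(int mask, char *buf, size_t len)
{
	static const struct {
		int bit;
		const char *name;
	} table[] = {
		{ EX_PTRACE_FORK, "FORK" },
		{ EX_PTRACE_VFORK, "VFORK" },
		{ EX_PTRACE_VFORK_DONE, "VFORK_DONE" },
		{ EX_PTRACE_LWP_CREATE, "LWP_CREATE" },
		{ EX_PTRACE_LWP_EXIT, "LWP_EXIT" },
	};
	size_t used = 0;
	int count = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		int n;

		if ((mask & table[i].bit) == 0)
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
		    count > 0 ? "|" : "", table[i].name);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial name */
			break;
		}
		used += (size_t)n;
		count++;
	}
	return count;
}
```

With the process events in the low bits and the LWP events after them, a mask
such as EX_PTRACE_FORK | EX_PTRACE_LWP_EXIT is described as "FORK|LWP_EXIT".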

LWP ones are crucial in order to handle properly hardware assisted watchpoints
and in general monitor properly threads in LLDB.

For now I'm unsure about clone(2) and posix_spawn(2) handling; I will research
it once fork(2) and vfork(2) are there. It looks like these calls are special
cases of fork(2) and/or vfork(2).

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5954833e6c75e46fa86e81abe7b26f8ddf54a54c

http://mail-index.netbsd.org/source-changes/2017/01/06/msg080572.html
http://mail-index.netbsd.org/source-changes/2017/01/06/msg080573.html
http://mail-index.netbsd.org/source-changes/2017/01/06/msg080574.html
http://mail-index.netbsd.org/source-changes/2017/01/06/msg080575.html
http://mail-index.netbsd.org/source-changes/2017/01/07/msg080576.html
http://mail-index.netbsd.org/source-changes/2017/01/07/msg080577.html
http://mail-index.netbsd.org/source-changes/2017/01/07/msg080578.html
http://mail-index.netbsd.org/source-changes/2017/01/07/msg080579.html
http://mail-index.netbsd.org/source-changes/2017/01/07/msg080582.html
http://mail-index.netbsd.org/source-changes/2017/01/07/msg080584.html
http://mail-index.netbsd.org/source-changes/2017/01/07/msg080585.html
http://mail-index.netbsd.org/source-changes/2017/01/07/msg080586.html

2017-01-07

I've added initial local code for accessors to support PTRACE_VFORK and
PTRACE_VFORK_DONE.

At the moment I'm unsure what's the appropriate way to store the pid for the
vfork and vfork done events. For now I've added two pid_t entries to mimic the
FORK event handling: one dedicated to VFORK and the other to VFORK_DONE.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/08/msg080646.html

2017-01-08

I've added more code for VFORK and VFORK_DONE. I've decided to stop handling
clone(2)/__clone(2) with a special event.. there is no point as it's just a
wrapper to fork(2) and vfork(2) in terms of spawning a new child and how it is
handled. There are two ways:
 - spawn child and let parent and child go
 - spawn child, wait till its termination and resume afterwards

fork(2) is the first, vfork(2) the second
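
The second way can be observed from userland with a few lines of C. This is a
minimal sketch; the function name is made up for this example.

```c
#define _DEFAULT_SOURCE	/* glibc: expose vfork() in strict -std modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <unistd.h>

/*
 * Spawn a child with vfork(2) that exits immediately with `code`.
 * Returns the child's exit status, or -1 on failure.
 */
int
ex_vfork_exit_status(int code)
{
	int status;
	pid_t pid;

	assert(code >= 0 && code <= 255);	/* exit statuses are 8 bits */
	pid = vfork();
	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: only _exit() or exec*() are safe */

	/* The parent gets here only once the child has exited (or exec'd). */
	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

With fork(2) in place of vfork(2) the function returns the same value, but
parent and child would be scheduled independently between the call and
waitpid(2).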

clone(2)/__clone(2) might be one or the other - there is really no point in
determining which. I don't see a reason for a debugger to care about file
descriptors, file system information or such - we just care about the spawning
and the code running flow.

The same applies to posix_spawn(2)/posix_spawnp(2).

All these calls are even handled by the same fork1(9) function inside the
kernel, furthermore from API point of view it's just a matter of convenience
to use fork(2) [+ execve(2)] or posix_spawn(2) or such.

I've added more code and I think currently the only missing part is to inject
generation of SIGTRAP in the proper places for vfork and vfork_done. I have the
impression that I'm generating TRAP_EXEC on the VFORK event for the child. It's
still fine, as it's still exec() and we can distinguish it from the process
creation with PT_GET_PROCESS_STATE.

I'm thinking about adding TRAP_CHLD for all the use-cases from parent point
of view: forked, vforked, vforkfinished.

I'm also thinking about how to implement LWP events. Looking at FreeBSD, we
mostly care about thread birth and termination; other data like waiters, park
and unpark is nontrivial to export with ptrace(2) and later handle in a
debugger.. it's much easier to use pthread_dbg(3) for this purpose.

For LWP I'm evaluating SIGTRAP & TRAP_LWP, and a new interface within
EVENT_MASK. The thing that bothers me is whether extending ptrace_state with
lwpid_t pe_lwp_event can be done with backward compatibility. Thankfully both
types are int32.

                         typedef struct ptrace_state {
                                 int     pe_report_event;
                                 pid_t   pe_other_pid;
                         } ptrace_state_t;


I want to keep pe_lwp there because with really many LWPs, like thousands,
detecting which thread was created by walking them with PT_LWPINFO would be too
complex an operation to perform quickly. With pe_lwp it's O(1) from a debugger
point of view.
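
One way such an extension could stay layout compatible is to overlay the two
ids in a union. The sketch below only checks that idea at compile time; the
ex_* types are stand-ins for pid_t, lwpid_t and ptrace_state, and the union
layout is an assumption of this example, not the committed interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t ex_pid_t;	/* stands in for pid_t */
typedef int32_t ex_lwpid_t;	/* stands in for lwpid_t */

/* The existing layout quoted above. */
struct ex_ptrace_state_old {
	int		pe_report_event;
	ex_pid_t	pe_other_pid;
};

/* Hypothetical extension: the second slot carries a pid or an LWP id. */
struct ex_ptrace_state_new {
	int		pe_report_event;
	union {
		ex_pid_t	pe_other_pid;
		ex_lwpid_t	pe_lwp;
	} u;
};

/* Old binaries keep working only if size and offsets do not move. */
_Static_assert(sizeof(struct ex_ptrace_state_old) ==
    sizeof(struct ex_ptrace_state_new), "size must not change");
_Static_assert(offsetof(struct ex_ptrace_state_old, pe_other_pid) ==
    offsetof(struct ex_ptrace_state_new, u), "offset must not change");

/* View a record filled in by old code through the new layout. */
struct ex_ptrace_state_new
ex_upgrade(const struct ex_ptrace_state_old *old)
{
	struct ex_ptrace_state_new st;

	assert(old != NULL);
	memcpy(&st, old, sizeof(st));
	return st;
}
```

With the id delivered in the event record itself, the debugger reads one field
instead of iterating over every LWP of the tracee.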

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/09/msg080677.html

2017-01-09

I've introduced TRAP_CHLD in SIGTRAP and added two tests siginfo5 and siginfo6.

One of the commit messages:
Introduce new si_code for SIGTRAP: TRAP_CHLD - process child trap

The SIGTRAP signal is thrown from the kernel if EVENT_MASK (ptrace_event)
enables PTRACE_FORK. This new si_code helps debuggers to distinguish the
exact source of signal delivered for a debugger.

Another purpose of TRAP_CHLD is to retain the same behavior inside the
NetBSD kernel for process child traps and have an interface to monitor it.

Retrieving exact event and extended properties of process child trap is
available with PT_GET_PROCESS_STATE.

There is no behavior change for existing software.

This si_code value is NetBSD extension.

Sponsored by <The NetBSD Foundation>


siginfo5 checks siginfo_t in fork(2) flow with monitoring events
siginfo6 checks siginfo_t for PT_STEP - to validate TRAP_TRACE

It was surprising to me that throwing SIGTRAP once in fork1(9) results in
two events in the debugger, one for the child and one for the parent.

I was also surprised that raise(2) generates SI_LWP, not SI_USER.
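
A quick way to check which si_code a self-raised signal carries on a given
system is to catch it with an SA_SIGINFO handler. This is a minimal sketch with
made-up ex_* names; the value it reports depends on the operating system, so no
particular result is claimed here.

```c
#define _POSIX_C_SOURCE 200809L	/* sigaction(2) in strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t ex_seen_signo;
static volatile sig_atomic_t ex_seen_code;

static void
ex_handler(int signo, siginfo_t *info, void *ctx)
{
	(void)ctx;
	ex_seen_signo = signo;
	ex_seen_code = info->si_code;
}

/*
 * Raise SIGUSR1 at ourselves and report the si_code that came with it.
 * Returns 0 and stores the code in *code on success, -1 on failure.
 */
int
ex_probe_raise_code(int *code)
{
	struct sigaction sa, old;

	assert(code != NULL);
	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = ex_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, &old) == -1)
		return -1;

	ex_seen_signo = 0;
	raise(SIGUSR1);	/* the handler runs before raise() returns */
	(void)sigaction(SIGUSR1, &old, NULL);

	if (ex_seen_signo != SIGUSR1)
		return -1;
	*code = ex_seen_code;
	return 0;
}
```

Comparing the stored value against SI_USER and the other SI_* constants from
<signal.h> shows which origin the kernel reports for raise().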

I've rebased my code for vfork and vforkdone... and I'm in the process of
trying to understand what's happening there.

TODO:
 - vfork events
 - lwp events
 - hardware assisted watchpoints events
 - lldb basic integration

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/09/msg080700.html
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080702.html
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080704.html
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080705.html
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080706.html
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080711.html
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080718.html

Also fixed some build issues after merging new zlib(3)
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080712.html
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080714.html
http://mail-index.netbsd.org/source-changes/2017/01/10/msg080715.html

2017-01-10

I updated the ptrace(2) documentation for TRAP_CHLD.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/11/msg080828.html
http://mail-index.netbsd.org/source-changes/2017/01/11/msg080829.html

2017-01-11

I've learnt that the vfork(2) parent resumes not just on termination of the
child, but also on exec() called by it.

For now my local code for PTRACE_VFORK_DONE appears to work.

I don't fully understand why TRAP_CHLD is reported for PTRACE_FORK..

Today there was less coding, as I was organizing a local NetBSD party for a few
local NetBSD (knowing) people.

Commits today:
NONE

2017-01-12

I've decided to ask Christos to push the vfork_done code now and postpone the
vfork one.. It's not trivial as of now to redesign the code (in other words, to
fully understand all the nits) without deadlocking and failing to emit the
signal on time.

Proposed code:
http://netbsd.org/~kamil/vforkdone.1.txt

The good news is that fork(2) and vfork(2) are partly supported in LLDB, and
it's not a crucial feature from the LLDB point of view. I wanted it mostly for
the sanity of the API, to support FORK and VFORK events and later other ones. I
plan to start working tomorrow on LWP birth and exit.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/12/msg080919.html

2017-01-13

I've committed PTRACE_VFORK and PTRACE_VFORK_DONE code.

I also started and finished PTRACE_LWP_CREATE and PTRACE_LWP_EXIT; they have
been committed.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/13/msg081009.html
http://mail-index.netbsd.org/source-changes/2017/01/13/msg081010.html
http://mail-index.netbsd.org/source-changes/2017/01/13/msg081011.html
http://mail-index.netbsd.org/source-changes/2017/01/13/msg081012.html
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081013.html
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081021.html
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081026.html
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081029.html
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081030.html
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081031.html
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081032.html

2017-01-14

I fixed a bug in PTRACE_LWP_EXIT and enhanced tests for it. This event was
reported as PTRACE_LWP_CREATE.

I noted that perhaps I shouldn't use si_trap for hardware assisted watchpoints,
as this field is already used for a value that is not very interesting to a
debugger, at least on amd64: the x86 trapno. It maps exactly to breakpoint or
trace, adding no extra information over si_code. Interestingly, si_trap2 and
si_trap3 already exist and are unused; I can use si_trap2 for the number of
the trap that fired and si_trap3 for watchpoint-specific data, in the x86 case
to note whether the trap was on single-step&watchpoint or just watchpoint.

I also added locally some changes to the API:
 - pw_type added to ptrace_watchpoint
 - new SIGTRAP si_code TRAP_HWWPT
 - start counting watchpoints from 1 -- to be a bit safer

I need to utilize si_trap2 and si_trap3 and refactor existing code for
watchpoints to be more flexible for new use-cases.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081046.html
http://mail-index.netbsd.org/source-changes/2017/01/14/msg081047.html


2017-01-15

I've committed TRAP_HWWPT to <sys/siginfo.h> and added initial HISTORY section
in ptrace(2).

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/15/msg081100.html
http://mail-index.netbsd.org/source-changes/2017/01/15/msg081103.html

2017-01-16

I've altered the interface for ptrace_watchpoint and enabled x86 watchpoints
in combination with a single-step hit. I'm still working on proper detection
of the events that occurred. I should finish it by tomorrow.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/16/msg081151.html
http://mail-index.netbsd.org/source-changes/2017/01/16/msg081152.html
http://mail-index.netbsd.org/source-changes/2017/01/16/msg081153.html

2017-01-17

I've pushed new functions in ptrace_watchpoints.

Embed hardware trap and its type that fired (x86), information for tracers

Now x86 throws SIGTRAP on hardware exception with:
 - si_code TRAP_HWWPT - dedicated for hw assisted watchpoint interface
 - si_trap - unchanged (T_TRCTRAP)
 - si_trap2 - watchpoint number that fired
 - si_trap3 - watchpoint specific event description

x86 returns in si_trap3 one of the field from <x86/dbregs.h>
 - X86_HW_WATCHPOINT_EVENT_FIRED - watchpoint fired
 - X86_HW_WATCHPOINT_EVENT_FIRED_AND_SSTEP - watchpoint fired under PT_STEP

Other changes:
 - restrict more code from <x86/dbregs.h> to _KERNEL

Sponsored by <The NetBSD Foundation>

Use siginfo_t to validate tests/kernel/arch/amd64/t_ptrace_wait*

This change makes sure that the expected watchpoint fired with the expected
property. It's done with PT_GET_SIGINFO and checking SIGTRAP codes.

Remove assert that Debug Registers are not mixed with Debug Trap Flag

New code is designed to mix them.

Sponsored by <The NetBSD Foundation>
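
The SIGTRAP fields described in the first commit message above could be decoded
by a tracer roughly as follows. This is a sketch under assumptions: the struct
stands in for the siginfo_t fetched with PT_GET_SIGINFO, the numeric values are
placeholders rather than the ones from <sys/siginfo.h> and <x86/dbregs.h>, and
watchpoint numbers are assumed to start at 1.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values for this sketch only. */
#define EX_TRAP_HWWPT					6
#define EX_X86_HW_WATCHPOINT_EVENT_FIRED		1
#define EX_X86_HW_WATCHPOINT_EVENT_FIRED_AND_SSTEP	2

/* The SIGTRAP fields a tracer would look at. */
struct ex_trapinfo {
	int si_code;
	int si_trap2;	/* watchpoint number that fired */
	int si_trap3;	/* watchpoint specific event description */
};

/*
 * Return the number of the watchpoint that fired, or 0 when the stop was
 * not a hardware watchpoint. *sstep is set to 1 when it fired under PT_STEP.
 */
int
ex_decode_watchpoint(const struct ex_trapinfo *ti, int *sstep)
{
	assert(ti != NULL && sstep != NULL);
	*sstep = 0;
	if (ti->si_code != EX_TRAP_HWWPT)
		return 0;
	if (ti->si_trap3 == EX_X86_HW_WATCHPOINT_EVENT_FIRED_AND_SSTEP)
		*sstep = 1;
	return ti->si_trap2;
}
```

Because numbering starts at 1 in this sketch, the return value 0 can double as
"no watchpoint".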

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b8495522e71121338c00e203df0be8b467148315

http://mail-index.netbsd.org/source-changes/2017/01/18/msg081172.html
http://mail-index.netbsd.org/source-changes/2017/01/18/msg081173.html
http://mail-index.netbsd.org/source-changes/2017/01/18/msg081174.html

2017-01-18

I've found that an event like a hardware watchpoint firing and a single-step
occurring at the same time cannot happen.. as the watchpoint is fired before
execution of an instruction and sstep is fired.. after it. This pushes me
towards PT_SETDBREGS and PT_GETDBREGS, handling all this in a debugger. There
is also additional complexity in that there are different generations of x86
CPUs and they offer different kinds of events.. I will switch this interface
later to PT_SETDBREGS and PT_GETDBREGS. Going back and forth looks bad..
however it might be the cost of researching the best possible solution. On the
other hand, I have added the interfaces needed to handle hardware watchpoints,
like LWP events. These events will be needed to set dbregs in a per-thread
manner within a debugger.

I've updated lldb-netbsd/TODO to:
$ cat TODO
Debugging to a file:
    log enable -STagnpstv -f /tmp/log.txt lldb all

Introduce objc++ setup with gcc(1) for "make test":
    Build Command Output:
    g++: error trying to exec 'cc1objplus': execvp: No such file or directory
    g++: error trying to exec 'cc1objplus': execvp: No such file or directory
    gmake[4]: *** [main.o] Error 1

llvm::call_once hack for src:
   curl https://github.com/jsonn/src/commit/78f4ee4c8349d68cf2279f2c7fc2196ae369e182.patch|gpatch -R -p1

The current milestone is to detect in Monitor Callback the following events:
 - process termination - works, appropriate return status passed to LLDB
 - execve(2) - detected but process hangs
 - software breakpoint (TRAP_BRKPT) - detected but process hangs
 - single step - detected but process hangs
 - fork - not detected (hangs earlier?)
 - vfork-done - detected but process hangs
 - lwp creation/termination - detected but process hangs
 - hardware breakpoint - currently not easily testable (skipped)
 - other signal passed to tracee - detected but process hangs

Short-term goals in next milestone:
 - fix conflict with system-wide py-six
 - add support for auxv read operation
 - switch resolution of pid -> path to executable from /proc to sysctl(7)
 - recognize Real-Time Signals (SIGRTMIN-SIGRTMAX)
 - upstream !NetBSDProcessPlugin code
 - switch std::call_once to llvm::call_once

To be done later:
 - registers' accessors
 - single step support
 - thread resume/suspend operation
 - i386 support
 - upstream NetBSD support
 - adapt upstream Python tests to run on NetBSD and pass as many of them as
   possible
 - import LLDB into base
 - add NetBSD specific ATF tests verifying fundamental functionality of LLDB

and of course fix as many bugs as possible

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/18/msg081181.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=05beb8ad6bf411f197f2d2572ff9620648dd829c
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2caf5a63f258f8ae0cd38ba50ec5a09084e8ac2a
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b203b2f74ee13e756b5dac8c0f1587bdaf887cc8
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=fa0503e23b5a9b0b89b10e20e06cca8ff6df9608
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b5fbe761b8cfd06a7e057af105903350c777873e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=237957edccc65a15e97d29c834f7a69202d0d7c9
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=3ea6e80a383496daf7a9c9e0d5c0a923d18b396a
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=9282dde8171cf1d9441ba6d38645d75a0d59e620
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e2aa6321fe80405c5d3f9d8bddd802c22507f4f9
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=149def4215b894a97a9bcdc06120d6381237335b

2017-01-19

It appears that I will focus now just on handling "other signal" raised, as
other cases have extra logic I don't want to do now.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=aa14e8a89a4573ddc233d98bcd6514e090e681c1

2017-01-20

The best summary of today's progress is what I wrote in a commit message:

Old report:

The current status is that in Linux process=thread and each
thread needs to be spawned or suspended separately. The code for remote
debugging is designed after Linux model and for NetBSD, we need to mimic
that there is single thread for certain interfaces (I've discussed it
with LLDB developers) - matching our concept of Process. Linux has code
to step or resume a process in the NativeThreadLinux part, we need to
call it per-process basis.

The action of Signal Monitor was ignored, as a tracee was marked as
Stopped after attaching (Launching -> Stopped). In the code to resume
it, I was just calling PT_CONTINUE without altering the status of tracee
(to Running or Stepping) and using ResumeAction list (it contains signal
to be passed). I discussed the proper design for NetBSD and our code for
it should live in NativeProcessNetBSD (not in NativeThreadNetBSD).

New report:

$ lldb
(lldb) process connect connect://localhost:1234
Process 29742 stopped
* thread #1, stop reason = The signal Stopped (signal) was caught
    frame #0:
(lldb) c
Process 29742 resuming
Hello world!
Process 29742 stopped
* thread #1, stop reason = The signal was generated via _lwp_kill(2) from pid=29742, uid=1000
    frame #0:
(lldb) c
Process 29742 resuming
Process 29742 exited with status = 0 (0x00000000)
(lldb)


Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=9d2fddfc221e76e1aeae0099c4bda9822d2247eb
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=cef5e7e2c3ac153dacf068a27288662ec3e69432
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=fd83895ef236de9f9c6c41140aca3fd2e6960b54

2017-01-21

Today I got functional handling of LWP creation/termination, software
breakpoints, and partly single-step.

I've decided that Stage I for LLDB is done and I'm in the process of writing a
summary for the TNF blog.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4a0275d66c89cb2f10912027b3e737014a68bf0c
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a57f58bcc17805675203f9fb31d45cb798258648
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b5e41a011c9507e6149a63cfddddd324e60c8cd5
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=61784f9e869f386ef907deee717dfd092187d56d
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=7aab540f6e7c38698ee67273c7ccbdde836bddae
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=31a00d3389dae13474514d68de17bc6f0090bc4f
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=efe5087cd7be3e7c9da8e61bb881b63f8c35c8a6
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=91b792dee0b6ee6e645dfe9391d1a9eddcf82a1e

2017-01-22

All day long I was working on the blog entry... it's not finished yet.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b6f36a095d775699146c81a244fa682b44e31da1

2017-01-23

I've finished the blog entry and posted it to TNF blog.

http://blog.netbsd.org/tnf/entry/summary_of_the_preliminary_lldb

It was propagated to other sites, like:

https://www.phoronix.com/scan.php?page=news_item&px=NetBSD-LLDB-Progress

And social media (Twitter,...)

I've updated my work machines to the recent userland and kernel, and rebuilt
all the packages I need. I'm ready for more work tomorrow. Today I switched the
NetBSD specific code that determines the executable path from resolving paths
via /proc to the sysctl(7) interface.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=920bfa7dbbe90a4203c1b1e04890cf98e415e421

2017-01-24

I've checked that the patch for the sysctl(7) function translating pid to path
name is working correctly and I submitted it for review.

https://reviews.llvm.org/D29089

I've also extended recognition of signals to the real-time ones. This patch is
also in review.

https://reviews.llvm.org/D29091
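
To illustrate what recognizing real-time signals involves, here is a minimal
sketch in C. It is not the code from the review above; the ex_* helpers and the
"SIGRTMIN+N" naming scheme are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L	/* SIGRTMIN/SIGRTMAX in strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdio.h>

/* Nonzero when signo falls into the real-time range of this system. */
int
ex_is_realtime_signal(int signo)
{
	return signo >= SIGRTMIN && signo <= SIGRTMAX;
}

/*
 * Format a name such as "SIGRTMIN+3" into buf.
 * Returns 0 on success, -1 if signo is not real-time or buf is too small.
 */
int
ex_realtime_signal_name(int signo, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);
	if (!ex_is_realtime_signal(signo))
		return -1;
	if (signo == SIGRTMIN)
		n = snprintf(buf, len, "SIGRTMIN");
	else
		n = snprintf(buf, len, "SIGRTMIN+%d", signo - SIGRTMIN);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Naming the signals relative to SIGRTMIN avoids hardcoding numbers, since the
range boundaries differ between operating systems.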

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4fe73191260dac40b7eda078e0f21c52f6074326
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=70a317ec39a2e92f9fc23d9951d9fc75564fdae6

2017-01-25

I was working on the ELF auxiliary vector, determining what it is and how to
use PIOD_READ_AUXV in PT_IO. I filed a PR and by the time I got back home, it
was fixed by Christos. More bugs popped up and Christos shook them out, mostly
with dynamic modules.
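
For reference, the auxiliary vector is an array of type/value pairs terminated
by an AT_NULL entry. Once a debugger has copied it out of the tracee, looking
up an entry could be sketched as below. The 64-bit entry layout and the ex_*
names are assumptions of this example; the transfer with PT_IO itself is not
shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Entry layout assumed for this sketch: 64-bit type/value pairs. */
struct ex_auxv_entry {
	uint64_t a_type;
	uint64_t a_val;
};

#define EX_AT_NULL	0	/* end of vector */
#define EX_AT_PAGESZ	6	/* system page size */
#define EX_AT_ENTRY	9	/* program entry point */

/*
 * Search a raw, suitably aligned auxv buffer of nbytes bytes for `type`.
 * Returns 0 and stores the value in *val when found, -1 otherwise.
 */
int
ex_auxv_lookup(const void *buf, size_t nbytes, uint64_t type, uint64_t *val)
{
	const struct ex_auxv_entry *e = buf;
	size_t n = nbytes / sizeof(*e);

	assert(buf != NULL && val != NULL);
	for (size_t i = 0; i < n; i++) {
		if (e[i].a_type == EX_AT_NULL)
			break;	/* terminator: nothing valid follows */
		if (e[i].a_type == type) {
			*val = e[i].a_val;
			return 0;
		}
	}
	return -1;
}
```

Bounding the loop by the number of bytes actually received, and not only by the
terminator, keeps the reader safe when the reported length is off.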

I got feedback in a private mail from Chuck Silvers that a tracee can mask some
signals, or all of them - it breaks DTrace as traps (breakpoints) are silently
ignored. It works correctly on Linux and FreeBSD, while NetBSD and OpenBSD are
broken.

Plan for tomorrow is to address the std::call_once issue. Once done, I will
implement the AUXV reader.

I've filed two bugs:
http://gnats.netbsd.org/51916 ptrace(2) PT_IO option PIOD_READ_AUXV returns large piod_len on exit
http://gnats.netbsd.org/51918 Tracee can prevent tracer to get its signals by masking

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/25/msg081342.html
http://mail-index.netbsd.org/source-changes/2017/01/25/msg081343.html
http://mail-index.netbsd.org/source-changes/2017/01/26/msg081349.html
http://mail-index.netbsd.org/source-changes/2017/01/26/msg081350.html

2017-01-26

Today I was researching the impact of signal masking on FreeBSD and Linux
compared with NetBSD - from the point of view of a debugger. In general, a
masked signal is blocked for both tracee and tracer if it was sent explicitly;
if it comes from the kernel it must not be masked and must be caught by a
debugger. Without this debuggers are incapable of monitoring a process.
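
The masking itself is easy to reproduce without a debugger. The sketch below
(plain POSIX, hypothetical function name) blocks a signal, raises it and checks
that it stays pending instead of being delivered; it does not involve ptrace(2)
at all.

```c
#define _POSIX_C_SOURCE 200809L	/* sigprocmask(2) in strict -std modes */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/*
 * Block signo, raise it and report whether it stayed pending.
 * Returns 1 if pending, 0 if not, -1 on error. The pending signal is
 * discarded before the original mask and disposition are restored.
 * Meant for single-threaded use.
 */
int
ex_masked_signal_stays_pending(int signo)
{
	sigset_t block, oldmask, pending;
	struct sigaction ign, oldact;
	int result;

	assert(signo > 0);
	sigemptyset(&block);
	if (sigaddset(&block, signo) == -1)
		return -1;
	if (sigprocmask(SIG_BLOCK, &block, &oldmask) == -1)
		return -1;

	raise(signo);	/* blocked: becomes pending, handler not run */
	if (sigpending(&pending) == -1)
		result = -1;
	else
		result = sigismember(&pending, signo);

	/* Ignoring a pending signal discards it; then put things back. */
	memset(&ign, 0, sizeof(ign));
	ign.sa_handler = SIG_IGN;
	sigemptyset(&ign.sa_mask);
	if (sigaction(signo, &ign, &oldact) == 0)
		(void)sigaction(signo, &oldact, NULL);
	(void)sigprocmask(SIG_SETMASK, &oldmask, NULL);
	return result;
}
```

A tracee that keeps a signal blocked this way is the situation a debugger has
to cope with for signals sent by other processes.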

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/26/msg081368.html
http://mail-index.netbsd.org/source-changes/2017/01/26/msg081370.html
http://mail-index.netbsd.org/source-changes/2017/01/26/msg081374.html
http://mail-index.netbsd.org/source-changes/2017/01/26/msg081375.html
http://mail-index.netbsd.org/source-changes/2017/01/27/msg081376.html

2017-01-27

I've pushed remaining tests for signals in t_ptrace_wait*.

I also finally managed to build llvm::call_once in LLDB code, I had to go for
"using namespace llvm", without it GCC outputs cryptic error messages.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/01/27/msg081393.html
http://mail-index.netbsd.org/source-changes/2017/01/27/msg081394.html
http://mail-index.netbsd.org/source-changes/2017/01/27/msg081395.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=1118be2babd6626a70fc5f26217c25924ddc77c6
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0c7755fb8c0f44886079b7b031ac0bccfff8064d
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=45c49f99e4383097911a8aee3c48bb450376e235

2017-01-28

Today, I planned to push the std::call_once to llvm::call_once switch
upstream... however building it wasn't that trivial. Things tend to break with
cryptic build errors.

On the other hand I pushed a simpler patch upstream for variadic arguments:
 - https://reviews.llvm.org/D29256
  "Do not pass non-POD type variables through variadic function"

I've committed two patches for LLDB upstream:
 - switch resolution of pid -> path to executable from /proc to sysctl(7)
   https://reviews.llvm.org/D29089
   committed as SVN revision 293392
 - recognize Real-Time Signals (SIGRTMIN-SIGRTMAX)
   https://reviews.llvm.org/D29091
   committed as SVN revision 293391

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a3f2f332b5b7767fdefdc285bbd77b92019f9676
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=7cc4886dba9626face0957aaea677a587084d5ca
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=7d9c2050465b0424e66b0e3ce658f7058b8a23af
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=88911656a31dd9a3dfdcb03615f492f6d6f5eeb9
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=c72bb4bb8781b055f0324bcf3540de039939cc90
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4f0b40525e0a4ffdb18465f8778ec1740b4ba7cb
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=ddd2b64913d59131702684c102728cf711554867
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=7b8362be817241def2d9aa91aaaf45d096ce2566

2017-01-29

I've pushed two more patches to review to LLDB:
 - https://reviews.llvm.org/D29264 "Add NetBSD support in Host::GetCurrentThreadID"
 - https://reviews.llvm.org/D29266 "Synchronize PlatformNetBSD with Linux"

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=ce4ec571b152a710595fb6a16285481c297819c7
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=928fa4966a7d0f6752e13835b08ad0961fb46c75
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=284e69a9ce209b60c7501b343e14979cdc1117b0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=6ebe2f84751fe90d542d9c11a0f3eb6e0e41336c

2017-01-30

I've submitted more patches to review to LLDB:
 - https://reviews.llvm.org/D29288 "Switch std::call_once to llvm::call_once"

LLVM ones:
 - https://reviews.llvm.org/D29296 "Make llvm::call_once more convenient to reuse out of LLVM"

The other ones got more review and discussion.

Two of them seem to be ready:
 - https://reviews.llvm.org/D29264 "Add NetBSD support in Host::GetCurrentThreadID"
 - https://reviews.llvm.org/D29256 "Do not pass non-POD type variables through variadic function"

I was also working a bit on LLD, the LLVM linker, to refresh my mind a little
and switch context to something new. It mostly works in terms of building, but
it's broken in terms of working..

$ clang -fuse-ld=lld test.c -v
clang version 5.0.0
Target: x86_64-unknown-netbsd7.99.59
Thread model: posix
InstalledDir: /usr/pkg/bin
 "/usr/pkg/bin/clang-5.0" -cc1 -triple x86_64-unknown-netbsd7.99.59
-emit-obj -mrelax-all -disable-free -disable-llvm-verifier
-discard-value-names -main-file-name test.c -mrelocation-model static
-mthread-model posix -mdisable-fp-elim -masm-verbose
-mconstructor-aliases -munwind-tables -target-cpu x86-64 -v
-dwarf-column-info -debugger-tuning=gdb -resource-dir
/usr/pkg/bin/../lib/clang/5.0.0 -fdebug-compilation-dir /tmp
-ferror-limit 19 -fmessage-length 190 -fobjc-runtime=gnustep
-fdiagnostics-show-option -fcolor-diagnostics -o /var/tmp/test-25a749.o
-x c test.c
clang -cc1 version 5.0.0 based upon LLVM 5.0.0svn default target
x86_64-unknown-netbsd7.99.59
#include "..." search starts here:
#include <...> search starts here:
 /usr/pkg/bin/../lib/clang/5.0.0/include
 /usr/include
End of search list.
 "/usr/pkg/bin/ld.lld" --eh-frame-hdr -dynamic-linker /libexec/ld.elf_so
-o a.out /usr/lib/crt0.o /usr/lib/crti.o /usr/lib/crtbegin.o
/var/tmp/test-25a749.o -lc -lgcc --as-needed -lgcc_s --no-as-needed
/usr/lib/crtend.o /usr/lib/crtn.o
/usr/pkg/bin/ld.lld: error: unable to find library -lc
/usr/pkg/bin/ld.lld: error: unable to find library -lgcc
/usr/pkg/bin/ld.lld: error: unable to find library -lgcc_s
clang-5.0: error: linker command failed with exit code 1 (use -v to see
invocation)

Pushing it forward would certainly be cool! I've pinged an OpenBSD developer
who was working on LLD patches to help figure out what needs to change and
where.

I'm still frustrated that gcc5-aux doesn't build.. there are some cryptic
issues. I need to address it at some level.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a74b7ace60bde2ba7e268575d8a43d6c9f6b8b8e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=51e2b0f6f39c776bf22a10e2666bb1e62c5301fa
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=8c73792e20cb70384d165f959cf55302db13be7d
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=d0382deb7c529db4bc4b321c9e6173b775660bd8
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=aa3472b12b09244fd79138e13cdf2cabe29a20f8
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=8531424945731ab3d4665da86cdb9910075fb861
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5cf35d6fc22a7f5f5b81161093532e5d6b39daae
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=cf09c31fdb5d2ef7876ba86b763ae247b01c2632

2017-01-31

It was another day on the LLVM-projects review board.

Ready to land:
D29347 Transform ProcessLauncherLinux to ProcessLauncherPosixFork
D29256 Do not pass non-POD type variables through variadic function
D29296 Make llvm::call_once more convenient to reuse out of LLVM

Waiting for feedback:
D29288 Switch std::call_once to llvm::call_once
D29266 Synchronize PlatformNetBSD with Linux

Plan for tomorrow:
 - py-six conflict removal patch
 - llvm::call_once patch improvement
 - auxv reading patch

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=39b4bbcbdf3bb2a66415c91cf5f124cfac64f88b
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=3daccb6e29ebdcf2aefededa729ee8da41ce9dcf
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5e9d39d1671034e91b0fa70b89e3c90e19dc78b0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=93b422357fbf63564a6214055c7368ace24fc8f1
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=730522294465a41856ecdca895704b0b11ed1fbe
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a45e01754421ab99b76e331e9f3d2075ab52a4a8
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=90721804b4ee4ea950a22ea5a9ced325d318cdfa
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4516e6fdb2042d57a1fe21364b8cfd02341cdac8
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b2800872cb0917f2580c20a97a30391d3149dbed
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=26b32ed526cc47369ce01c34b95ae0eebd02c1b2
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2006e55d3b0f21db86c878f9f83b93413b290c34
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5154b3a6c11b14675118bc4c0945e516f1554c2b
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=937b1c476e6346bb74caf428e89b4d64b76117b6
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=412c389bdd7605657841bfe5f2c564dd09759fee

2017-02-01

My plan was rather nice.. but not achievable in one day.

I've committed three patches upstream in LLDB:

   "Document that LaunchProcessPosixSpawn is used on NetBSD"
   committed as SVN 293770
   http://llvm.org/viewvc/llvm-project/?view=rev&revision=293770

   "Add ProcessLauncherNetBSD to spawn a tracee"
   renamed to:
   "Transform ProcessLauncherLinux to ProcessLauncherPosixFork"
   https://reviews.llvm.org/D29347
   committed as SVN 293768
   http://llvm.org/viewvc/llvm-project/?view=rev&revision=293768

   "Do not pass non-POD type variables through variadic function"
   https://reviews.llvm.org/D29256
   committed as SVN revision 293774
   http://llvm.org/viewvc/llvm-project/?view=rev&revision=293774

The current status is as follows
================================

Ready to Land:
D29403 Fix multi-process-driver.cpp build on NetBSD

Ready to Update:
D29288 Switch std::call_once to llvm::call_once
D29266 Synchronize PlatformNetBSD with Linux

Waiting on Review:
D29405 Install six.py copy into subdirectory lldb
D29296 Make llvm::call_once more convenient to reuse out of LLVM

Waiting on Authors:
D29406 Unify PlatformPOSIX::ResolveExecutable

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=52f4b281c17072e3753213f12cb2ec06ce8706cd
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=ed3276b8c3f09f4b7af669243784c02ca9e74b75
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e09880953a637c9f948653ae55d46058ca235c1c
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=1710c4dba07d5cc3943c6ba37684aae5a66e89ee
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=367ba7ec3ad701f3c43b2935375218d955097364
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=fe40174a64b74dd2144f18df832de00d033e8949

2017-02-02

Today three patches landed upstream:
D29403 Fix multi-process-driver.cpp build on NetBSD
D29296 Make llvm::call_once more convenient to reuse out of LLVM

and a generic change by Pavel Labath that indirectly helps NetBSD:

D29406 Unify PlatformPOSIX::ResolveExecutable


I was working on the six.py patch and it's still in progress. I'm debugging
Python scripts right now; for some reason options aren't propagated to the
appropriate files.

I also now have room to implement the llvm::call_once switch in LLDB.

I wrote today on LLVM review (https://reviews.llvm.org/D29266):

I need to finish three patches:

- llvm::call_once,
- auxv reading,
- six.py conflict removal

and I will join here.

After that, I will switch the PT_WATCHPOINT* interface to PT_GETDBREGS and
PT_SETDBREGS -- and in the end, add dbregs support in NetBSD's userdata. Later
on, I will need to finish a few tasks on the NetBSD side (thread suspend/resume;
detect simple false positives in the check-lldb target).


Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=04ee7c5ec7906f653a6652a6f6334384bed24b3e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=34b1447bdd51c5f32dd5ca91f5d1f4dc8993ce31
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=be0e12e8ba80036eee6d6d805d56c69994bb9de2
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=ff7f5445fcc74670c9411098da7e38b61bfa99b4
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e6034f7066ab75b414b66f81c6fba0b93d81ae35
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=d1c3184f008c5a85efa3fda55a14cae4015655ce
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=71559fe1227ed36794296c9bb2df416eb5e63be1
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=927d822fc001b1b26a7bfbbaf93f8fb6898964c2
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=46dd206a13a48c9bc5f642ed324b82feca93f13b

2017-02-03

Today six.py finally started to work and landed in the upstream sources.

Pavel Labath reduced duplication in Platform Plugins - he removed a few
functions there:
http://llvm.org/viewvc/llvm-project?rev=294019&view=rev

I was working on llvm::call_once.. something doesn't work properly, as the
debugger locks up.

The current plan is to finish llvm::call_once and afterwards add code for auxv
reading on NetBSD. I hope to get it done by Monday.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=fd7cc716c1ce280b2f0501a0fbf13e7c7edcf1f0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=02293042db99203a4491ace64429d957a610ae0e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=39d820e75003fd2447fa35793d9242ee4d805272
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2baa872e66bfe63eeacbb907affa17abd48dabf4
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a514c6393773f78a849303d87bef4284d9c9bba4

http://llvm.org/viewvc/llvm-project?view=revision&revision=294071

2017-02-04

Bigger news.

The remaining patch in "upstream !NetBSDProcessPlugin code" has been
accepted upstream:

 "Synchronize PlatformNetBSD with Linux"
  https://reviews.llvm.org/D29266

I decided to abandon the patch for GetName/SetName removal for now and
reschedule it for later. I don't yet understand its impact on the final
version of the Native Process/Thread Plugin.

Patches in "switch std::call_once to llvm::call_once" started to work
correctly (without visible regressions):

  "Switch std::call_once to llvm::call_once"
  https://reviews.llvm.org/D29288

After improving my knowledge of the GDB Remote Protocol, I can say that so far
our new interfaces match its needs very nicely. There is one improvement to be
made in this field - distinguishing PT_SYSCALL events. I'm thinking about new
si_code values TRAP_SCE (syscall entry) and TRAP_SCX (syscall exit). This is
also required for DTrace (libproc) on NetBSD. The values are modeled after the
FreeBSD flags quoted below (but, unlike FreeBSD, they are not placed in
ptrace_lwpinfo):

    PL_FLAG_SCE
           The thread stopped due to system call
           entry, right after the kernel is entered.
           The debugger may examine syscall arguments
           that are stored in memory and registers
           according to the ABI of the current
           process, and modify them, if needed.
    PL_FLAG_SCX
           The thread is stopped immediately before
           syscall is returning to the usermode.  The
           debugger may examine system call return
           values in the ABI-defined registers and/or
           memory.
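
A tracer could tell the two stops apart with a check along these lines. This
is only a sketch: TRAP_SCE and TRAP_SCX are the names proposed above, the
numeric fallback values are placeholders picked for illustration (not values
taken from any NetBSD header), and classify_stop() is a hypothetical helper.

```c
#include <signal.h>

/* Placeholder values, used only when the system headers do not provide
 * the proposed codes. They are assumptions made for this sketch. */
#ifndef TRAP_SCE
#define TRAP_SCE 1001
#endif
#ifndef TRAP_SCX
#define TRAP_SCX 1002
#endif

enum stop_kind { STOP_OTHER, STOP_SYSCALL_ENTRY, STOP_SYSCALL_EXIT };

/* Classify a stop from the signal number and si_code that the tracer
 * obtained for the stopped tracee (for example from its siginfo_t). */
static enum stop_kind
classify_stop(int signo, int si_code)
{
	if (signo != SIGTRAP)
		return STOP_OTHER;
	if (si_code == TRAP_SCE)
		return STOP_SYSCALL_ENTRY;
	if (si_code == TRAP_SCX)
		return STOP_SYSCALL_EXIT;
	return STOP_OTHER;
}
```

The point of putting the distinction into si_code is that the tracer needs no
extra bookkeeping to know whether it is looking at syscall arguments or at
return values.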

My current plan:
 1. Finish llvm::call_once switch
 2. Commit upstream new Platform Plugin NetBSD
 3. Resume work on verifying the AUXV interface in ptrace(2) and
implement it in LLDB for NetBSD - I'm thinking right now that it should
be truncated with AT_NULL (included) - I need to compare with Linux and
FreeBSD and if applicable add more tests
 4. PT_WATCHPOINTS -> PT_*DBREGS and add appropriate code in LLDB
(dbregs in userdata on NetBSD)
 5. Thread lock/unlock (suspend/resume) ptrace(2) calls
 6. Research new si_codes in SIGTRAP for PT_SYSCALL (primary need in
libproc - DTrace)
 7. Try to improve setup for running "check-lldb"

1.-2. are ready now, modulo feedback from upstream
3. is planned to be started tomorrow right away
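
The truncation idea from item 3 can be sketched as follows. This is an
illustration under assumptions, not the LLDB or kernel code: struct aux_entry
is a simplified stand-in for a 64-bit auxv entry, and auxv_truncate() is a
hypothetical helper. The only fact relied upon is that AT_NULL (type 0)
terminates the ELF auxiliary vector.

```c
#include <stddef.h>
#include <stdint.h>

#define AUX_AT_NULL 0	/* AT_NULL terminates the ELF auxiliary vector */

/* Simplified stand-in for a 64-bit auxv entry (type, value). */
struct aux_entry {
	uint64_t a_type;
	uint64_t a_val;
};

/*
 * Return how many entries of buf are meaningful: everything up to and
 * including the first AT_NULL entry. If no terminator is found within
 * nentries, all nentries are reported.
 */
static size_t
auxv_truncate(const struct aux_entry *buf, size_t nentries)
{
	size_t i;

	for (i = 0; i < nentries; i++) {
		if (buf[i].a_type == AUX_AT_NULL)
			return i + 1;
	}
	return nentries;
}
```

Keeping the terminator in the result lets a consumer that walks the vector
until AT_NULL work on the truncated copy unchanged.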

I'm also planning to document porting notes and the differences between
NetBSD, FreeBSD and Linux in ptrace(2). I noted that people tend to misuse
PT_LWPINFO on NetBSD, expecting the same behavior as on FreeBSD.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0277dd37e7007f89a9d9b7cb3fc9b9a0a04fa94d
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=ea4943c2328ad1fe095bb05bfed0563c04879626
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=81b2e7b3278e611f591dc33c8d0c67d3b3791fb8
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=da6179f5a742e2883153f8a433c3b7bb32353713
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2461b2fd253fb5df6f5b8c084eb5a0b2efc2e011
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=ce7cb673b3f8c4190d98bca7e2d043fee3c5ea60
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=9051b4dd6a42bae1de40513b1bfe8658368c6625

2017-02-05

I've prepared and committed a patch in LLVM for new llvm::once_flag.

I also pushed PlatformNetBSD to the LLDB sources.. so right now there is only
a single patch pending upstream -- the llvm::call_once switch in LLDB. I'm
looking forward to getting it tested by other developers.

Applying patches, rebuilding sources etc. took so much time that I didn't
touch the AUXV code. On the other hand, close to 20% of the code from the
lldb-netbsd patches disappeared! Once the llvm::call_once patch goes away, I
will have a fairly clean tree with a small number of patches (maybe 10) to
maintain.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=07b61aec28c28408d19ad564740bd64d05bb0e2b
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2eb7ba709b5cc4bdde2b9ece9bc5da52e5027d71
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=f76cd6c6c7bdf796010690379315105006898e4f
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=95b158461aefd501a7591230ad4574ebfca22d60
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=d076a43e85c9118136e70e79b10fe3c0b350d461
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=f22516efb4fccd34de8ace649026436eecdaa0c9
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=099636eacf8cd0547010b052f00d99d65dabeed9

2017-02-06

A patch for AUXV reading has been committed to pkgsrc-wip and this repo has
been updated to a recent SVN revision.

So far I think AUXV is never used; at least I haven't triggered this code when
running the test suite. This is the reason why I will keep it downstream for
now.

To sum it up, all the current patches for LLVM and LLDB were upstreamed.

Next tasks:
 - LWP suspend/resume API
 - PT_WATCHPOINT -> PT_*ETDBREGS

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=db4d8c9548b407b7f6c6080ee787a88e58943167
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=3b083c779f66ce83a8cf46d52605c8b4f6388fbb
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=ff003a84dfa52fd71e26291ae874dd2d524be3fc
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0331b2c59be8008ff4185cac12579bd0bcc1398c

2017-02-07

Today I had rather relaxing tasks. I finally obsoleted exect(3), turning it
into a plain execve(2) call with a warning. Another thing I did today was
removing libpthread_dbg(3) from base.

I also started writing PT_SUSPEND and PT_RESUME. I noted that Linux provides
another feature that is missing on NetBSD: PTRACE_GETSIGMASK and
PTRACE_SETSIGMASK.

I noted that PTRACE_GETSIGMASK and PTRACE_SETSIGMASK are used for checkpoints
in criu. https://criu.org/Main_Page NetBSD needs that!

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/08/msg081773.html
http://mail-index.netbsd.org/source-changes/2017/02/08/msg081771.html
http://mail-index.netbsd.org/source-changes/2017/02/08/msg081768.html
http://mail-index.netbsd.org/source-changes/2017/02/07/msg081767.html
http://mail-index.netbsd.org/source-changes/2017/02/07/msg081758.html
http://mail-index.netbsd.org/source-changes/2017/02/07/msg081756.html
http://mail-index.netbsd.org/source-changes/2017/02/07/msg081754.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5a01106f8819f3b3cfa68450343f835d40f7167e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=86b79a5f9a2c6e24bffdc32f3d6c2e70bde98006
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=84d03c8f0d0fec830c18bd792d630f489be8515e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=22cec044e8af0c1ea08bfe5f5e7f56d1bdfeff14
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b7b2963fa5574f0a2229b9b5d8213836586e12f9
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=10b15cb92e8f3b95aff799f882def13f2422952b

2017-02-08

I've added a dummy update for edb-debugger in pkgsrc-wip. I would like to use
this debugger for tests of register accessors.

I was checking the API for sigmask read/write and apparently there is a way to
read it through kinfo_proc2.. but it looks read-only.

I've written a patch adding support for PT_SUSPEND and PT_RESUME on NetBSD. I
need to compare it with FreeBSD. That OS has different semantics.. but
I need to check how it behaves in general.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=87316558bcd469b2002b8eb3513902e739eac96c
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b67083fefd6e02ad8c4aae2ee0cc2b5db7f2c92e

http://mail-index.netbsd.org/source-changes/2017/02/08/msg081792.html

2017-02-09

PT_SUSPEND and PT_RESUME are progressing.. they seem to work.. somewhat, but
there are basic cases when they do not.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0c6c2770e0860425bbf0ba9d1e1fe106556a7a86

2017-02-10

I've published PT_SUSPEND & PT_RESUME code on the mailing list.

I'm proposing an API to restore the functionality to resume or suspend a
specified thread from execution.

This interface was implemented in the past in user-space inside
pthread(3) with the M:N thread model (with help from removed pthread_dbg).

http://netbsd.org/~kamil/patch-00028-pt_suspend-pt_resume.txt

This code is close to FreeBSD and shares the same request names
(PT_RESUME and PT_SUSPEND), however on NetBSD we pass the full pair of
tracee's pid_t and thread's lwpid_t. FreeBSD specifies just thread ID,
which is insufficient on NetBSD, as a single tracer can control multiple
tracees and face duplicated lwpid_t.
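
The ambiguity can be illustrated with a small tracer-side table. This is a
hypothetical sketch and not part of the patch: struct tracked_lwp and both
helpers are invented for illustration, and a plain int stands in for lwpid_t.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical tracer-side record for one thread of one tracee. */
struct tracked_lwp {
	pid_t pid;	/* tracee process */
	int lwpid;	/* thread ID, unique only within that process */
	int suspended;
};

/* Count how many tracked threads share a bare thread ID. */
static size_t
count_by_lwpid(const struct tracked_lwp *t, size_t n, int lwpid)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if (t[i].lwpid == lwpid)
			hits++;
	return hits;
}

/* Look a thread up by the full (pid, lwpid) pair; NULL if not tracked. */
static struct tracked_lwp *
find_lwp(struct tracked_lwp *t, size_t n, pid_t pid, int lwpid)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (t[i].pid == pid && t[i].lwpid == lwpid)
			return &t[i];
	return NULL;
}
```

With two tracees that each have an LWP numbered 1, a lookup by thread ID alone
matches two entries, while the (pid, lwpid) pair selects exactly one.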

I've added an interface to detect if a specific LWP has been suspended
(or not) with extending the PT_LWPINFO interface with a new pl_event
value PL_EVENT_SUSPENDED (next to PL_EVENT_NONE and PL_EVENT_SIGNAL).

There is a new check preventing deadlocks, and with this patch ptrace(2) can
set a new errno, EDEADLK. I haven't checked the existing code, but it appears
that we can deadlock a tracee with the current PT_CONTINUE and friends.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=b3dfd4974f21f282971f90316be194892653a395
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=4cba342d28b79cfdc97feb3307c3b4efe8b3189f
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=716947aa5710b602e814471653bafed037883dbd

2017-02-11

I've committed upstream a new interface for signal mask accessor in threads.

TODO:
 - PT_SUSPEND
 - PT_RESUME
 - PT_SETDBREGS / PT_GETDBREGS
 - TRAP_SCX
 - TRAP_SCE

Evaluate:
 - PT_STEPIN (PT_STEP and PT_SYSCALL)

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/12/msg081939.html
http://mail-index.netbsd.org/source-changes/2017/02/12/msg081940.html
http://mail-index.netbsd.org/source-changes/2017/02/12/msg081941.html
http://mail-index.netbsd.org/source-changes/2017/02/11/msg081932.html
http://mail-index.netbsd.org/source-changes/2017/02/11/msg081930.html

2017-02-12

It's a checkpoint day. I'm upgrading my system, rebuilding packages.. and
clearing the room for PT_SETDBREGS/PT_GETDBREGS patch.

I started to prototype this code.

There is also one nit needed in PT_SET_SIGMASK: I need to take sigcantmask
into account.
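
In user-space terms the missing step amounts to removing the unblockable
signals from the requested mask before installing it. A minimal sketch,
assuming sigcantmask covers SIGKILL and SIGSTOP; strip_unmaskable() is a
hypothetical helper and this is not the kernel code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/*
 * Drop the signals that may never be blocked from a requested mask.
 * SIGKILL and SIGSTOP are the two signals POSIX forbids blocking.
 */
static void
strip_unmaskable(sigset_t *mask)
{
	sigdelset(mask, SIGKILL);
	sigdelset(mask, SIGSTOP);
}
```

Without such a step a tracer could hand the tracee a mask that pretends to
block SIGKILL or SIGSTOP.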

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/12/msg081958.html

2017-02-13

It took me the whole day to write a report on the TNF blog.. about the finished
task on upstreaming the code.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/13/msg081993.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=6a962bfa08fd972176799f0497193db9867994b1

2017-02-14

It was a long day, really long. I was working on switching the watchpoints API
to PT_GETDBREGS and PT_SETDBREGS... I have made some progress, as I'm now able
to save the content of dbreg in the kernel.. but a process hangs on a
breakpoint without returning to user.

The first draft:
http://netbsd.org/~kamil/dbg/dbg100.txt

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/14/msg082031.html

2017-02-15

I used this day to upgrade my environment and rethink the design.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/15/msg082040.html

2017-02-16

I was debugging why Debug Registers don't work with ATF tests.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/16/msg082067.html
http://mail-index.netbsd.org/source-changes/2017/02/17/msg082079.html
http://mail-index.netbsd.org/source-changes/2017/02/17/msg082080.html
http://mail-index.netbsd.org/source-changes/2017/02/17/msg082081.html

2017-02-17

It was a long day. I've finally fixed the bug and made PT_GETDBREGS and
PT_SETDBREGS functional. The culprit was a missing PT_CONTINUE in the tests.
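
For context, this is roughly what a tracer has to compute before handing the
registers to PT_SETDBREGS on x86: the watched address goes into one of
DR0..DR3 and the matching control bits go into DR7. The sketch below only
composes the DR7 value from the architectural bit layout; the helper and enum
names are hypothetical, and the layout of the NetBSD register structure is
deliberately not shown.

```c
#include <stdint.h>

/* x86 DR7 condition (R/W) field encodings. */
enum dr7_rw { DR7_RW_EXEC = 0, DR7_RW_WRITE = 1, DR7_RW_IO = 2, DR7_RW_RDWR = 3 };
/* x86 DR7 length (LEN) field encodings. */
enum dr7_len { DR7_LEN_1 = 0, DR7_LEN_2 = 1, DR7_LEN_8 = 2, DR7_LEN_4 = 3 };

/*
 * Return dr7 with the watchpoint in slot (0..3) locally enabled and its
 * R/W and LEN fields set. Layout: L<n> is bit 2*n, R/W<n> is bits
 * 16+4*n..17+4*n, LEN<n> is bits 18+4*n..19+4*n.
 */
static uint64_t
dr7_set_watchpoint(uint64_t dr7, unsigned slot, enum dr7_rw rw, enum dr7_len len)
{
	unsigned shift = 16 + 4 * slot;

	dr7 &= ~((uint64_t)0xf << shift);	/* clear old R/W and LEN */
	dr7 |= (uint64_t)rw << shift;
	dr7 |= (uint64_t)len << (shift + 2);
	dr7 |= (uint64_t)1 << (2 * slot);	/* local enable bit */
	return dr7;
}
```

For example, a 4-byte write watchpoint in slot 0 yields 0xD0001. Setting the
registers alone does nothing visible; the tracee still has to be resumed
before the watchpoint can fire, which is where the tests went wrong.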

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/18/msg082132.html
http://mail-index.netbsd.org/source-changes/2017/02/18/msg082131.html
http://mail-index.netbsd.org/source-changes/2017/02/17/msg082122.html
http://mail-index.netbsd.org/source-changes/2017/02/17/msg082120.html

2017-02-18

It's a surprise - GDB works a bit with PT_GETDBREGS / PT_SETDBREGS!

(gdb) c
Continuing.

Watchpoint 2: traceme

Old value = 0
New value = 16
main (argc=1, argv=0x7f7fff79fe30) at test.c:8
8               printf("traceme=%d\n", traceme);
(gdb) c
Continuing.

Watchpoint 2 deleted because the program has left the block in
which its expression is valid.
0x00007f7e3c000782 in _rtld_bind_start () from /usr/libexec/ld.elf_so
(gdb) c
Continuing.

Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007f7e3c0007b5 in _rtld_bind_start () from /usr/libexec/ld.elf_so
(gdb) c
Continuing.
traceme=16
traceme=17
[Inferior 1 (process 28797) exited normally]


I don't understand the _rtld_bind_start issue from the dynamic linker, but I
will skip it for now.

I'm evaluating how to make this interface more reliable, so that the shadowed
registers reflect the ones that are really set. I noted that the current
approach messes with them quite a bit.. perhaps harmlessly, but I still
wouldn't like to set them randomly for unrelated processes.

I was also relaxing with a port of edb-debugger(-git) to NetBSD.. I was
surprised that the so-called FreeBSD and OpenBSD ports are long dead and out
of sync with the current code.. and also incomplete in functionality. I want
to reuse this debugger as a sandbox to verify ptrace calls, mostly for
register accessors.

This means that edb cannot be finished quickly, although it's quite a simple
tracer when compared to LLDB or GDB.. I was also told today that a 64-bit
debugger on a 64-bit kernel cannot trace 32-bit software on NetBSD..

There are 10 days left to finish my current segment.

I think the best plan is to:
 [1] add many more tests for amd64 for dbregs in ATF
 [2] revamp the interface for more accurate shadowing; proof-read for bugs
 [3] i386 port tests, test with ATF tests; Xen tests
 [4] commit dbregs; fix PT_SET_SIGMASK/PT_GET_SIGMASK and commit;
     commit API in ptrace(2) PT_SET_SIGMASK and PT_GET_SIGMASK; kernel bump;
     prepare patch for lldb upstream for PT_GETDBREGS/PT_SETDBREGS and send
     to review
 [5] PT_SYSCALL add first ATF tests
 [6] add PT_SYSCALLEMU tests
 [7] research TRAP_SCE and TRAP_SCX
 [8] prepare tests for TRAP_SCE and TRAP_SCX
 [9] commit TRAP_SCE and TRAP_SCX
[10] catchup with delayed tasks.. write blog entry

.... and time to confront it with reality, I'm good at underestimating the
needed work...

Commits today:
http://mail-index.netbsd.org/pkgsrc-changes/2017/02/18/msg153130.html

2017-02-19

[1] is partially done. I wanted to add more tests, at least 150 more..
however, it's good enough to move on.

Status for arch/amd64 tests..

Summary for 6 test programs:
    360 passed test cases.
    0 failed test cases.
    0 expected failed test cases.
    0 skipped test cases.

I wanted to add more tests, verifying at least that registers are not inherited
on fork(2) nor vfork(2).

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/19/msg082187.html
http://mail-index.netbsd.org/source-changes/2017/02/19/msg082188.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082192.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082194.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082196.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082198.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082200.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082204.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082207.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082208.html
http://mail-index.netbsd.org/source-changes/2017/02/20/msg082210.html

2017-02-20

I was forced to refactor the existing code.

2 - mostly done - amd64 has been revamped to shadow correctly registers;
| I optimized the code to stop allocating dbregs if they are not used;
| i386 and netbsd32 compat must catch up; I fixed a bug in netbsd32 that
| execve(2) wasn't clearing inherited; I fixed lwp_cpu_free bug and moved
| to lwp_cpu_free2; I added new ATF tests for execve(2) scenarios; TODO
| i386, TODO netbsd32; TODO fork(2) ATF tests; TODO vfork(2) ATF tests

netbsd.org/~kamil/dbg/dbg104.txt

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/21/msg082244.html

2017-02-21

I've fixed the code and validated on real i386 machine - Pentium IV.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/22/msg082272.html
http://mail-index.netbsd.org/source-changes/2017/02/22/msg082263.html
http://mail-index.netbsd.org/source-changes/2017/02/22/msg082262.html

2017-02-22

5 - done modulo PT_RESUME with some bug aboard - I filed a PR for it:
kern/51995: ptrace(2) PT_RESUME is not reliable

All patches sent upstream:

https://reviews.llvm.org/D30287 Introduce support for Debug Registers in
RegisterContextNetBSD_x86_64

https://reviews.llvm.org/D30288 Switch NetBSD from paccept(2) to accept4(2)

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/22/msg082291.html
http://mail-index.netbsd.org/source-changes/2017/02/23/msg082292.html
http://mail-index.netbsd.org/source-changes/2017/02/23/msg082299.html
http://mail-index.netbsd.org/source-changes/2017/02/23/msg082300.html
http://mail-index.netbsd.org/source-changes/2017/02/23/msg082301.html
http://mail-index.netbsd.org/source-changes/2017/02/23/msg082302.html
http://mail-index.netbsd.org/source-changes/2017/02/23/msg082303.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=25e8894c05be10ad6491d529b521b1e2a6add36e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=953e6f407af6ff93688febd1958722f5c75019d1
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=1fcc788326d631f3f7cb229c549377e24a0a83b0
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a4fee54c53ac1d8222d13df9574d52f165b6e906
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=2099225b92a5fbe6e4ecbcf4e8bb068cba1f4f0b
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=001eb5d2a92cf211b8edee2a7f799080e3c3833c

2017-02-23

5- done; PT_SYSCALL works as expected and I'm happy with it! I was worried that
there are some surprises there but it looks fine!

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/24/msg082332.html
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5ed17069bca598ad13012a93762ba2560d39a468

2017-02-24

I was too quick. PT_SYSCALL cannot stop on syscall entry.. it's similar to the
PTRACE_VFORK issue.

I will try to dig into it until the end of February.

sysctl -w proc.2320.stopexec=1 works.. so there is hope.

Commits today:
NONE

2017-02-25

I was working on a tool to prepare slides about porting ptrace(2) software to
NetBSD. py-beampy had a lot of diverse dependencies like latex, dvisvgm,
pdf2svg, inkscape and others.. and in the end it didn't work... I gave up on
it and moved on to packaging py-landslide, which worked nicely.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/02/25/msg082349.html

2017-02-26

I was working on ptrace(2) porting to NetBSD slides.

Commits today:
NONE

2017-02-27

I've finished the slides and I got a proof-reading review from spz.

netbsd.org/~kamil/ptrace-netbsd/presentation.html

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=5db8249c6c8687628c3713f972e4944677e826f9
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=32e57f9b1c6e7dcf5fca44f6530bd169c310e88e
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=7951272e0ad83e7f43c30a931b48eae6efc50862

2017-02-28

It was a general day with preparations for moving on to LLDB.

I also included TRAP_SCE and TRAP_SCX in siginfo(2).

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=dd8da58fc2eb9ab0b3e0200d3c8fcea36862d184
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0fbc456a9612d659f9b93891a88d00d2a848d6f4
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=1ba1cb091693d2ea9fd23fea5c6bf03ee1c6e291

http://mail-index.netbsd.org/source-changes/2017/03/01/msg082432.html
http://mail-index.netbsd.org/source-changes/2017/02/28/msg082427.html
http://mail-index.netbsd.org/source-changes/2017/02/28/msg082426.html
http://mail-index.netbsd.org/source-changes/2017/02/28/msg082425.html

2017-03-01

This was the first day on LLDB again!

I dropped a mail to LLDB and tech-userlevel:

Hello,

The contract for the LLDB port on NetBSD has been prolonged by The
NetBSD Foundation. The additional time will cover the features that were
delayed in order to address blockers that were unveiled during the work
that has been done.

I've summarized the newly finished task segment in this blog entry:

http://blog.netbsd.org/tnf/entry/ptrace_2_tasks_segment_finished

My current plan is to return to LLDB and finish the following tasks:
  I. Register context and breakpoints support on NetBSD/amd64.
 II. NetBSD Threads support
III. NetBSD/i386 (32-bit x86) support.

To finalize the first goal I use LLVM/Clang/LLDB SVN rev. 296360 as the
base for my local patches. I work in pkgsrc-wip/lldb-netbsd and I
develop there local patches.

The current Test Suite status reports 267/1235 tests passed
successfully. This number of passing tests is expected to start growing
once the goals will be achieved and LLDB will be rendered into a
functional debugger on NetBSD.

===================
Test Result Summary
===================
Test Methods:       1235
Reruns:                1
Success:             267
Expected Failure:     21
Failure:             332
Error:               167
Exceptional Exit:      0
Unexpected Success:    1
Skip:                444
Timeout:               3
Expected Timeout:      0

http://netbsd.org/~kamil/lldb/check-lldb-r296360-2017-02-28.txt


<This work is sponsored by The NetBSD Foundation.>


I also dropped this mail to christos@ and agc@ (part of it):

I'm working on Register Context support in NetBSD. In general there is
no distinct line between NetBSD threads and all other topics. This is
why I'm de facto working on both topics concurrently -- however
currently without goal to handle more than 1 thread within the process.

Right now I'm trying to put all the needed NetBSD specific support into
NetBSD-specific Process Plugin code. This might lead to monstrous design
as thread-specific protocol needs to talk with the whole process on
NetBSD. But for now my ultimate goal is to make it functional, later I
will work with upstream to streamline the generic code to handle and
land NetBSD support without clumsy hacks.

As usual, the existing FreeBSD code is useless. Darwin/Windows code
isn't using the generic framework, at least not to the extent needed to
delinuxize it.

I've imported the Linux code for Register Context and adapted it so that
it builds; however, the art is to make it properly functional. So far I
skip Debug Registers to simplify the initial milestones: set and catch a
breakpoint, and print a backtrace.


I also filed all the remaining bugs in ptrace(2) in order to move on.

Changes today:
http://mail-index.netbsd.org/source-changes/2017/03/01/msg082438.html
http://mail-index.netbsd.org/source-changes/2017/03/01/msg082443.html
http://mail-index.netbsd.org/source-changes/2017/03/01/msg082444.html

https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commit;h=329d9e27b8b8adb9887a80add866fdf28f7b1540
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commit;h=d1cc9dcc31a01c52ef66c92064c8565941c13cb6
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commit;h=614e3cf38afe05f399eec7b48d818e9760e98984
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commit;h=5f38833cd08a56e1a851d66611243e004c38a003
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commit;h=2642a5b494df439f32bd6898e1180d328eecb31b
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commit;h=63dc66dbf404bf10f08ad4c538ebb44d95e800fe
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commit;h=c88f38d2211bb34b9546ebeabca2dfa33c4e2b24

2017-03-02

It was a long day. I was learning the GDB Remote Protocol and debugging the
debugger by checking the content of the RSP messages. I was comparing it with
Linux as well. In general there are 3 initial differences I would like to
squash:
 - YMM registers are reported on Linux while missing on NetBSD,
 - QPassSignal is used on Linux, missing on NetBSD,
 - ELF AUXV reading is used on Linux, missing on NetBSD.
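
For reading such RSP traffic by hand it helps to remember how a packet is
framed: '$', the payload, '#', and a two-digit hex checksum that is the sum
of the payload bytes modulo 256. Below is a minimal Python sketch of that
framing. The function names (rsp_checksum, rsp_frame, rsp_unframe) are made
up for illustration and are not LLDB or GDB code; the sketch deliberately
ignores payload escaping and run-length encoding.

```python
def rsp_checksum(payload):
    """Sum of all payload bytes, modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload):
    """Wrap a raw payload as $<payload>#<two hex digits>."""
    digits = ("%02x" % rsp_checksum(payload)).encode("ascii")
    return b"$" + payload + b"#" + digits


def rsp_unframe(packet):
    """Check the framing and checksum of a packet, return its payload."""
    if len(packet) < 4 or packet[:1] != b"$" or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    expected = int(packet[-2:].decode("ascii"), 16)
    if rsp_checksum(payload) != expected:
        raise ValueError("checksum mismatch")
    return payload


# 'g' (read general registers) is the single byte 0x67.
print(rsp_frame(b"g"))
print(rsp_frame(b"qSupported"))
print(rsp_unframe(b"$qSupported#37"))
```

With this in hand, the payloads seen in a NetBSD session and in a Linux
session can be extracted from the logged packets and compared side by side.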

Beyond that, something is wrong with reading memory in the tracee: the
debugger tries to read with PT_IO and, for some unknown reason, I'm getting
EINVAL from the kernel.

Today I understood that the FreeBSD approach with PT_SETSTEP and PT_CLEARSTEP
might be the best, as it handles both combining PT_SYSCALL with PT_STEP and
combining PT_STEP with emitting a signal.

Commits today:
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=a511ea555fc92e6422a4f715452ef65627c34116
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=0a8847620c511ef596f0d6343ab9f66b1b3ee8bd
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=d2a8e52387e9d43ad0647403eacbf47b5b1cf792

http://mail-index.netbsd.org/source-changes/2017/03/02/msg082477.html

2017-03-03

I got to the point where I know what's going on with the invalid read of
memory. The lldb client gets an invalid state of SP (the stack pointer
register). I need to address it.

Commits today:
http://mail-index.netbsd.org/source-changes/2017/03/03/msg082483.html
http://mail-index.netbsd.org/source-changes/2017/03/03/msg082484.html


https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e27f38daefe35c0256d7264ce1b4fb12aae606c6
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=fea381bedbaf7ba83d28959ec5f87ddb5ddf70f8
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=e3379da83678f5f0e0471fcfeed08ad26a6f3b92
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=363249dc817e5b20d1651c566903b8d1e118832d
https://wip.pkgsrc.org/cgi-bin/gitweb.cgi?p=pkgsrc-wip.git;a=commitdiff;h=9a26092c358ebdb9094d3648cd5cd3645f6aa9c8

2017-03-04 -> ...

I concluded that the daily logs have turned into snapshots of commits. I have
no extra time to produce quality entries here, so I'm moving entirely to
quality monthly reports on the NetBSD Foundation blog.