10 Commits

Author SHA1 Message Date
David van Moolenbroek
27852ebe53 UDS: full rewrite
This new implementation of the UDS service is built on top of the
libsockevent library.  It thereby inherits all the advantages that
libsockevent brings.  However, the fundamental restructuring
required for that change also paved the way for resolution of a
number of other important open issues with the old UDS code.  Most
importantly, the rewrite brings the behavior of the service much
closer to POSIX compliance and NetBSD compatibility.  These are the
most important changes:

- due to the use of libsockevent, UDS now supports multiple suspending
  calls per socket and a large number of standard socket flags and
  options;
- socket address matching is now based on <device,inode> lookups
  instead of canonized path names, and socket addresses are no longer
  altered either due to canonization or at connect time;
- the socket state machine is now well defined, most importantly
  resolving the erroneous reset-on-EOF semantics of the old UDS, but
  also allowing socket reuse;
- sockets are now connected before being accepted instead of being
  held in connecting state, unless the LOCAL_CONNWAIT option is set
  on either the connecting or the listening socket;
- connect(2) on datagram sockets is now supported (needed by syslog),
  and proper datagram socket disconnect notification is provided;
- the receive queue now supports segmentation, associating ancillary
  data (in-flight file descriptors and credentials) with each segment
  instead of being kept fully separately; this is a POSIX requirement
  (and needed by tmux);
- as part of the segmentation support, the receive queue can now hold
  as many packets as can fit, instead of one;
- in addition to the flags supported by libsockevent, the MSG_PEEK,
  MSG_WAITALL, MSG_CMSG_CLOEXEC, MSG_TRUNC, and MSG_CTRUNC send and
  receive flags are now supported;
- the SO_PASSCRED and SO_PEERCRED socket options are replaced by
  LOCAL_CREDS and LOCAL_PEEREID respectively, now following NetBSD
  semantics and allowing use of NetBSD libc's getpeereid(3);
- memory usage is reduced by about 250 KB due to centralized in-flight
  file descriptor tracking, with a limit of OPEN_MAX total rather than
  of OPEN_MAX per socket;
- memory usage is reduced by another ~50 KB due to removal of state
  redundancy, despite the fact that socket path names may now be up to
  253 bytes rather than the previous 104 bytes;
- compared to the old UDS, there is now very little direct indexing on
  the static array of sockets, thus allowing dynamic allocation of
  sockets more easily in the future;
- the UDS service now has RMIB support for the net.local sysctl tree,
  implementing preliminary support for NetBSD netstat(1).

Change-Id: I4a9b6fe4aaeef0edf2547eee894e6c14403fcb32
2017-03-09 23:39:56 +00:00
David van Moolenbroek
40dec70c39 trace(1): print sin6_scope_id when relevant
Site-local addresses are out, as they are RFC-deprecated and not
supported on MINIX 3 at all.  Interface-local and link-local multicast
addresses are in, because they are relevant in the context of a
particular zone ID only.

Change-Id: I64a9ecb472946f717f27a72c4073d78aa1120508
2017-02-16 10:21:56 +00:00
David van Moolenbroek
3ac58492b3 Add LLVM GCOV coverage support
With this patch, it is now possible to generate coverage information
for MINIX3 system services with LLVM.  In particular, the system can
be built with MKCOVERAGE=yes, either with a native "make build" or
with crosscompilation.  Either way, MKCOVERAGE=yes will build the
MINIX3 system services with coverage profiling support, generating a
.gcno file for each source module.  After a reboot it is possible to
obtain runtime coverage data (.gcda files) for individual system
services using gcov-pull(8).  The combination of the .gcno and .gcda
files can then be inspected with llvm-cov(1).

For reasons documented in minix.gcov.mk, only system service program
modules are supported for now; system service libraries (libsys etc.)
are not included.  Userland programs are not affected by MKCOVERAGE.

The heart of this patch is the libsys code that writes data generated
by the LLVM coverage hooks into a serialized format using the routines
we already had for GCC GCOV.  Unfortunately, the new llvm_gcov.c code
is LLVM ABI dependent, and may therefore have to be updated later when
we upgrade LLVM.  The current implementation should support all LLVM
versions 3.x with x >= 4.

The rest of this patch is mostly a light cleanup of our existing GCOV
infrastructure, with as most visible change that gcov-pull(8) now
takes a service label string rather than a PID number.

Change-Id: I6de055359d3d2b3f53e426f3fffb17af7877261f
2016-09-24 22:18:31 +00:00
David van Moolenbroek
c38dbb97aa Prepare for switch to native BSD socket API
Currently, the BSD socket API is implemented in libc, translating the
API calls to character driver operations underneath.  This approach
has several issues:

- it is inefficient, as most character driver operations are specific
  to the socket type, thus requiring that each operation start by
  bruteforcing the socket protocol family and type of the given file
  descriptor using several system calls;
- it requires that libc itself be changed every time system support
  for a new protocol is added;
- various parts of the libc implementations violate the asynchronous
  signal safety POSIX requirements.

In order to resolve all these issues at once, the plan is to turn the
BSD socket calls into system calls, thus making the BSD socket API the
"native" ABI, removing the complexity from libc and instead letting
VFS deal with the socket calls.

The overall change is going to break all networking functionality. In
order to smoothen the transition, this patch introduces the fifteen
new BSD socket system calls, and makes libc try these first before
falling back on the old behavior.  For now, the VFS implementations of
the new calls fail such that libc will always use the fallback cases.
Later on, when we introduce the actual implementation of the native
BSD socket calls, all statically linked programs will automatically
use the new ABI, thus limiting actual application breakage.

In other words: by itself, this patch does nothing, except add a bit
of transitional overhead that will disappear in the future.  The
largest part of the patch is concerned with adding full support for
the new BSD socket system calls to trace(1) - this early addition has
the advantage of making system call tracing output of several socket
calls much more readable already.

Both the system call interfaces and the trace(1) support have already
been tested using code that will be committed later on.

Change-Id: I3460812be50c78be662d857f9d3d6840f3ca917f
2016-02-23 14:34:05 +00:00
David van Moolenbroek
c33d6ef392 VFS: start off cleanup of pipe2 IPC message
There is no reason to use a single message for nonoverlapping requests
and replies combined, and in fact splitting them out allows reuse of
messages and avoids various problems with field layouts.  Since the
upcoming socketpair(2) system call will be using the same reply as
pipe2(2), split up the single message used for the latter.  In order
to keep the used parts of messages at the front, start a transitional
phase to move the pipe(2) flags field to the front of its request.

Change-Id: If3f1c3d348ec7e27b7f5b7147ce1b9ef490dfab9
2016-02-22 23:23:02 +00:00
David van Moolenbroek
b58e161ccb trace(1): resolve all level-5 LLVM warnings
Change-Id: If5ffe97eb0b15387b1ab674657879e13f58fb27e
2016-01-16 14:04:15 +01:00
David van Moolenbroek
bc2d75fa05 Rework getrusage(2) infrastructure
- the userland call is now made to PM only, and PM relays the call to
  other servers as appropriate; this is an ABI change that will
  ultimately allow us to add proper support for wait3() and the like;
  for the moment there is backward compatibility;
- the getrusage-specific kernel subcall has been removed, as it
  provided only redundant functionality, and did not provide the means
  to be extended correctly in the future - namely, allowing the kernel
  to return different values depending on whether resource usage of
  the caller (self) or its children was requested;
- VM is now told whether resource usage of the caller (self) or its
  children is requested, and it refrains from filling in wrong values
  for information it does not have;
- VM now uses the correct unit for the ru_maxrss values;
- VFS is cut out of the loop entirely, since it does not provide any
  values at the moment; a comment explains how it should be readded.

Change-Id: I27b0f488437dec3d8e784721c67b03f2f853120f
2015-09-28 14:06:59 +00:00
David van Moolenbroek
cd27b2627a getrusage(2): zero out ru_i[xds]rss fields
The current values were both inaccurate (especially for dynamically
linked executables) and using the wrong unit (bytes, instead of
kilobytes times ticks-of-execution).  For now we are better off not
populating these fields at all.

Change-Id: I195a8fa8db909e64a833eec25f59c9ee0b89bdc5
2015-09-28 14:06:58 +00:00
David van Moolenbroek
424cad2cd6 VFS: add support for F_DUPFD_CLOEXEC
Change-Id: Ibe422c6c99fe5fd1385884843ff9e15111810309
2015-07-20 13:55:10 +00:00
David van Moolenbroek
521fa314e2 Add trace(1): the MINIX3 system call tracer
Change-Id: Ib970c8647409196902ed53d6e9631a1673a4ab2e
2014-11-04 21:46:31 +00:00