phunix

Author	SHA1	Message	Date
David van Moolenbroek	ef8d499e2d	Add lwip: a new lwIP-based TCP/IP service This commit adds a new TCP/IP service to MINIX 3. As its core, the service uses the lwIP TCP/IP stack for maintenance reasons. The service aims to be compatible with NetBSD userland, including its low-level network management utilities. It also aims to support modern features such as IPv6. In summary, the new LWIP service has support for the following main features: - TCP, UDP, RAW sockets with mostly standard BSD API semantics; - IPv6 support: host mode (complete) and router mode (partial); - most of the standard BSD API socket options (SO_); - all of the standard BSD API message flags (MSG_); - the most used protocol-specific socket and control options; - a default loopback interface and the ability to create one more; - configuration-free ethernet interfaces and driver tracking; - queuing and multiple concurrent requests to each ethernet driver; - standard ioctl(2)-based BSD interface management; - radix tree backed, destination-based routing; - routing sockets for standard BSD route reporting and management; - multicast traffic and multicast group membership tracking; - Berkeley Packet Filter (BPF) devices; - standard and custom sysctl(7) nodes for many internals; - a slab allocation based, hybrid static/dynamic memory pool model. Many of its modules come with fairly elaborate comments that cover many aspects of what is going on. The service is primarily a socket driver built on top of the libsockdriver library, but for BPF devices it is at the same time also a character driver. Change-Id: Ib0c02736234b21143915e5fcc0fda8fe408f046f	2017-04-30 13:16:03 +00:00
David van Moolenbroek	27852ebe53	UDS: full rewrite This new implementation of the UDS service is built on top of the libsockevent library. It thereby inherits all the advantages that libsockevent brings. However, the fundamental restructuring required for that change also paved the way for resolution of a number of other important open issues with the old UDS code. Most importantly, the rewrite brings the behavior of the service much closer to POSIX compliance and NetBSD compatibility. These are the most important changes: - due to the use of libsockevent, UDS now supports multiple suspending calls per socket and a large number of standard socket flags and options; - socket address matching is now based on <device,inode> lookups instead of canonized path names, and socket addresses are no longer altered either due to canonization or at connect time; - the socket state machine is now well defined, most importantly resolving the erroneous reset-on-EOF semantics of the old UDS, but also allowing socket reuse; - sockets are now connected before being accepted instead of being held in connecting state, unless the LOCAL_CONNWAIT option is set on either the connecting or the listening socket; - connect(2) on datagram sockets is now supported (needed by syslog), and proper datagram socket disconnect notification is provided; - the receive queue now supports segmentation, associating ancillary data (in-flight file descriptors and credentials) with each segment instead of being kept fully separately; this is a POSIX requirement (and needed by tmux); - as part of the segmentation support, the receive queue can now hold as many packets as can fit, instead of one; - in addition to the flags supported by libsockevent, the MSG_PEEK, MSG_WAITALL, MSG_CMSG_CLOEXEC, MSG_TRUNC, and MSG_CTRUNC send and receive flags are now supported; - the SO_PASSCRED and SO_PEERCRED socket options are replaced by LOCAL_CREDS and LOCAL_PEEREID respectively, now following NetBSD semantics and allowing use of NetBSD libc's getpeereid(3); - memory usage is reduced by about 250 KB due to centralized in-flight file descriptor tracking, with a limit of OPEN_MAX total rather than of OPEN_MAX per socket; - memory usage is reduced by another ~50 KB due to removal of state redundancy, despite the fact that socket path names may now be up to 253 bytes rather than the previous 104 bytes; - compared to the old UDS, there is now very little direct indexing on the static array of sockets, thus allowing dynamic allocation of sockets more easily in the future; - the UDS service now has RMIB support for the net.local sysctl tree, implementing preliminary support for NetBSD netstat(1). Change-Id: I4a9b6fe4aaeef0edf2547eee894e6c14403fcb32	2017-03-09 23:39:56 +00:00
David van Moolenbroek	40dec70c39	trace(1): print sin6_scope_id when relevant Site-local addresses are out, as they are RFC-deprecated and not supported on MINIX 3 at all. Interface-local and link-local multicast addresses are in, because they are relevant in the context of a particular zone ID only. Change-Id: I64a9ecb472946f717f27a72c4073d78aa1120508	2017-02-16 10:21:56 +00:00
David van Moolenbroek	10a44c0ee2	trace(1): add basic support for timestamps This patch adds strace-like support for a -t command line option, which causes a timestamp to be printed at the beginning of each line. If the option is given more than once, the output will also include microseconds. Change-Id: I8cda581651859448c154b01815cc49d915b7b354	2016-12-28 13:06:04 +00:00
David van Moolenbroek	3ac58492b3	Add LLVM GCOV coverage support With this patch, it is now possible to generate coverage information for MINIX3 system services with LLVM. In particular, the system can be built with MKCOVERAGE=yes, either with a native "make build" or with crosscompilation. Either way, MKCOVERAGE=yes will build the MINIX3 system services with coverage profiling support, generating a .gcno file for each source module. After a reboot it is possible to obtain runtime coverage data (.gcda files) for individual system services using gcov-pull(8). The combination of the .gcno and .gcda files can then be inspected with llvm-cov(1). For reasons documented in minix.gcov.mk, only system service program modules are supported for now; system service libraries (libsys etc.) are not included. Userland programs are not affected by MKCOVERAGE. The heart of this patch is the libsys code that writes data generated by the LLVM coverage hooks into a serialized format using the routines we already had for GCC GCOV. Unfortunately, the new llvm_gcov.c code is LLVM ABI dependent, and may therefore have to be updated later when we upgrade LLVM. The current implementation should support all LLVM versions 3.x with x >= 4. The rest of this patch is mostly a light cleanup of our existing GCOV infrastructure, the most visible change being that gcov-pull(8) now takes a service label string rather than a PID number. Change-Id: I6de055359d3d2b3f53e426f3fffb17af7877261f	2016-09-24 22:18:31 +00:00
David van Moolenbroek	764cd267a7	INET/LWIP: minimal net.route sysctl support At a point not too far in the future, we will be switching from the hardcoded MINIX3 implementation of the getifaddrs(3) libc routine to the proper NetBSD implementation. The latter uses the net.route.rtable sysctl functionality to obtain its information. In order to make the transition as painless as possible, this patch adds basic support for that net.route.rtable functionality to INET and LWIP, using the remote MIB (RMIB) facility. Change-Id: I54f5cea7985f6606e317c73a5e6be3a5d07bc7dc	2016-06-18 12:47:30 +00:00
David van Moolenbroek	c38dbb97aa	Prepare for switch to native BSD socket API Currently, the BSD socket API is implemented in libc, translating the API calls to character driver operations underneath. This approach has several issues: - it is inefficient, as most character driver operations are specific to the socket type, thus requiring that each operation start by bruteforcing the socket protocol family and type of the given file descriptor using several system calls; - it requires that libc itself be changed every time system support for a new protocol is added; - various parts of the libc implementations violate the asynchronous signal safety POSIX requirements. In order to resolve all these issues at once, the plan is to turn the BSD socket calls into system calls, thus making the BSD socket API the "native" ABI, removing the complexity from libc and instead letting VFS deal with the socket calls. The overall change is going to break all networking functionality. In order to smoothen the transition, this patch introduces the fifteen new BSD socket system calls, and makes libc try these first before falling back on the old behavior. For now, the VFS implementations of the new calls fail such that libc will always use the fallback cases. Later on, when we introduce the actual implementation of the native BSD socket calls, all statically linked programs will automatically use the new ABI, thus limiting actual application breakage. In other words: by itself, this patch does nothing, except add a bit of transitional overhead that will disappear in the future. The largest part of the patch is concerned with adding full support for the new BSD socket system calls to trace(1) - this early addition has the advantage of making system call tracing output of several socket calls much more readable already. Both the system call interfaces and the trace(1) support have already been tested using code that will be committed later on. Change-Id: I3460812be50c78be662d857f9d3d6840f3ca917f	2016-02-23 14:34:05 +00:00
David van Moolenbroek	c33d6ef392	VFS: start off cleanup of pipe2 IPC message There is no reason to use a single message for nonoverlapping requests and replies combined, and in fact splitting them out allows reuse of messages and avoids various problems with field layouts. Since the upcoming socketpair(2) system call will be using the same reply as pipe2(2), split up the single message used for the latter. In order to keep the used parts of messages at the front, start a transitional phase to move the pipe(2) flags field to the front of its request. Change-Id: If3f1c3d348ec7e27b7f5b7147ce1b9ef490dfab9	2016-02-22 23:23:02 +00:00
David van Moolenbroek	b58e161ccb	trace(1): resolve all level-5 LLVM warnings Change-Id: If5ffe97eb0b15387b1ab674657879e13f58fb27e	2016-01-16 14:04:15 +01:00
David van Moolenbroek	2f09e77b82	MIB: add support for System V IPC information node The kernel.ipc.sysvipc_info node is the gateway from NetBSD ipcs(1) and ipcrm(1) to the IPC server, and thus necessary for a clean import of these two utilities. The MIB service implementation uses the preexisting (Linux-specific) information calls on the IPC server to obtain the information. Change-Id: I85d1e193162d6b689f114764254dd7f314d2cfa0	2016-01-16 14:04:12 +01:00
David van Moolenbroek	4d272e5a97	IPC server: NetBSD sync, general improvements - switch to the NetBSD identifier system; it is not only better, but also required for porting NetBSD ipcs(1) and ipcrm(1); however, it requires that slots not be moved, and that results in some changes; - synchronize some other things with NetBSD: where keys are kept, as well as various non-permission mode flags; - fix semctl(2) vararg retrieval and message field type; - use SUSPEND instead of weird reply exceptions in the call table; - fix several memory leaks and at least one missing permission check; - improve the atomicity of semop(2) by a small amount, even though its atomicity is still broken at a fundamental level; - use the new cheaper way to retrieve the current time; - resolve all level-5 LLVM warnings. Change-Id: I0c47aacde478b23bb77d628384aeab855a22fdbf	2016-01-16 13:58:47 +01:00
David van Moolenbroek	d991a2bea3	Retire sysuname(2), synchronize sys/utsname.h Now that uname(3) uses sysctl(2), we no longer need sysuname(2). Backward compatibility is retained for old statically linked binaries for a short while. Also remove the now-obsolete MINIX3-specific "arch" field from the utsname structure. While this is an ABI break at the libc level, it should pose no problems in practice, because: - statically linked programs (i.e., all of the base system) are not affected, as they will use headers synchronized with libc; - the structure is getting smaller, thus, older dynamically linked programs (typically in pkgsrc) using the new libc will end up with garbage in the "arch" field, but it is unlikely they will use this field anyway, since it was specific to MINIX3; - new dynamically linked programs using an old libc could end up with memory corruption, but this is not a scenario that is expected to occur in the first place - certainly not with programs from pkgsrc. Change-Id: I29c76576f509feacc8f996f0bd353ca8961d4917	2016-01-13 20:32:46 +01:00
David van Moolenbroek	25d39513e7	MIB: initial tree population Change-Id: I28ef0a81a59faaf341bfc15178df89474779a136	2016-01-13 20:32:44 +01:00
David van Moolenbroek	e4e21ee1b2	Add MIB service, sysctl(2) support The new MIB service implements the sysctl(2) system call which, as we adopt more NetBSD code, is an increasingly important part of the operating system API. The system call is implemented in the new service rather than as part of an existing service, because it will eventually call into many other services in order to gather data, similar to ProcFS. Since the sysctl(2) functionality is used even by init(8), the MIB service is added to the boot image. MIB stands for Management Information Base, and the MIB service should be seen as a knowledge base of management information. The MIB service implementation of the sysctl(2) interface is fairly complete; it incorporates support for both static and dynamic nodes and imitates many NetBSD-specific quirks expected by userland. The patch also adds trace(1) support for the new system call, and adds a new test, test87, which tests the fundamental operation of the MIB service rather thoroughly. Change-Id: I4766b410b25e94e9cd4affb72244112c2910ff67	2016-01-13 20:32:37 +01:00
David van Moolenbroek	23199f6205	RS: allow service program name to be overridden Until now, the program name of a service was always the file name (without directory) of the service binary. The program name is used to, among other things, find the corresponding system.conf entry. With ASR moving to a situation where all rerandomized service binaries are stored in a single directory, this can no longer be maintained. Instead, the service(8) command can now be instructed to override the service program name, using its new -progname option. Change-Id: I981e9b35232c88048d8804ec5eca58d1e4a5db82	2016-01-13 20:32:31 +01:00
David van Moolenbroek	29346ab043	PM: add support for wait4(2) This patch adds support for the wait4 system call, and with that the wait3 call as well. The implementation is absolutely minimal: only user and system times of the exited child are returned (with all other rusage fields left zero), and there is no support for tracers. Still, this should cover the main use cases of wait4. Change-Id: I7a04589a8423a23990ab39aa38e85d535556743a	2015-09-29 18:15:28 +00:00
David van Moolenbroek	bc2d75fa05	Rework getrusage(2) infrastructure - the userland call is now made to PM only, and PM relays the call to other servers as appropriate; this is an ABI change that will ultimately allow us to add proper support for wait3() and the like; for the moment there is backward compatibility; - the getrusage-specific kernel subcall has been removed, as it provided only redundant functionality, and did not provide the means to be extended correctly in the future - namely, allowing the kernel to return different values depending on whether resource usage of the caller (self) or its children was requested; - VM is now told whether resource usage of the caller (self) or its children is requested, and it refrains from filling in wrong values for information it does not have; - VM now uses the correct unit for the ru_maxrss values; - VFS is cut out of the loop entirely, since it does not provide any values at the moment; a comment explains how it should be readded. Change-Id: I27b0f488437dec3d8e784721c67b03f2f853120f	2015-09-28 14:06:59 +00:00
David van Moolenbroek	0f8e20a12c	getrusage(2): zero out ru_nsignals field The current value was both wrong (counting spawned kernel signals rather than delivered user signals) and returned for the calling process even if the request was for the process's children. For now we are better off not populating this field at all. Change-Id: I6c660be266b5746b7c3db57ae88fa7f872961ee2	2015-09-28 14:06:58 +00:00
David van Moolenbroek	cd27b2627a	getrusage(2): zero out ru_i[xds]rss fields The current values were both inaccurate (especially for dynamically linked executables) and using the wrong unit (bytes, instead of kilobytes times ticks-of-execution). For now we are better off not populating these fields at all. Change-Id: I195a8fa8db909e64a833eec25f59c9ee0b89bdc5	2015-09-28 14:06:58 +00:00
David van Moolenbroek	20054ae93f	Kernel: separate userland ABI on kernel page Currently, the userland ABI uses a single field ('user_sp') far into the very large 'kinfo' structure on the shared kernel information page. This precludes us from modifying or getting rid of 'kinfo' in the future without breaking userland. This patch adds a separate 'kuserinfo' structure to the kernel information page, with only information that is part of the userland ABI, in an extensible manner. Userland now uses this field if it is present, and falls back to the old field if not. Change-Id: Ib7b24b53a440f40a2edc28cdfa48447ac2179288	2015-09-23 12:01:15 +00:00
David van Moolenbroek	594df55e53	Abstract away minix_kerninfo access Instead of importing an external _minix_kerninfo variable, any code using the shared kernel page should now call get_minix_kerninfo(3). Since this is the only logical name for such a function, rename the previous get_minix_kerninfo call to ipc_minix_kerninfo. Change-Id: I2e424b6fb55aa55d3da850187f1f7a0b7cbbf910	2015-09-21 15:09:04 +00:00
David van Moolenbroek	424cad2cd6	VFS: add support for F_DUPFD_CLOEXEC Change-Id: Ibe422c6c99fe5fd1385884843ff9e15111810309	2015-07-20 13:55:10 +00:00
David van Moolenbroek	da21d85025	Add PTYFS, Unix98 pseudo terminal support This patch adds support for Unix98 pseudo terminals, that is, posix_openpt(3), grantpt(3), unlockpt(3), /dev/ptmx, and /dev/pts/. The latter is implemented with a new pseudo file system, PTYFS. In effect, this patch adds secure support for unprivileged pseudo terminal allocation, allowing programs such as tmux(1) to be used by non-root users as well. Test77 has been extended with new tests, and no longer needs to run as root. The new functionality is optional. To revert to the old behavior, remove the "ptyfs" entry from /etc/fstab. Technical notes: o The reason for not implementing the NetBSD /dev/ptm approach is that implementing the corresponding ioctl (TIOCPTMGET) would require adding a number of extremely hairy exceptions to VFS, including the PTY driver having to create new file descriptors for its own device nodes. o PTYFS is required for Unix98 PTYs in order to avoid that the PTY driver has to be aware of old-style PTY naming schemes and even has to call chmod(2) on a disk-backed file system. PTY cannot be its own PTYFS since a character driver may currently not also be a file system. However, PTYFS may be subsumed into a DEVFS in the future. o The Unix98 PTY behavior differs somewhat from NetBSD's, in that slave nodes are created on ptyfs only upon the first call to grantpt(3). This approach obviates the need to revoke access as part of the grantpt(3) call. o Shutting down PTY may leave slave nodes on PTYFS, but once PTY is restarted, these leftover slave nodes will be removed before they create a security risk. Unmounting PTYFS will make existing PTY slaves permanently unavailable, and absence of PTYFS will block allocation of new Unix98 PTYs until PTYFS is (re)mounted. Change-Id: I822b43ba32707c8815fd0f7d5bb7a438f51421c1	2015-06-23 17:43:46 +00:00
Ben Gras	3c8950cce9	minix/ changes for arm llvm build . fixes needed to build Minix/ARM with LLVM without errors, mostly size_t cleanness Change-Id: If4dd0a23bc5cb399296073920a8940c34b4caef4	2014-12-03 23:40:56 +01:00
David van Moolenbroek	8b18d03deb	trace(1): document how to add an IOCTL handler Also fix two small IOCTL-related bugs: - do not print an argument pointer for argument-less IOCTLs; - print IOCTL contents with -V given once, just like structures. Change-Id: Iec7373003d71937fd34ee4b9db6c6cec0c916411	2014-11-12 12:02:29 +00:00
Lionel Sambuc	5d8311761a	Turn PCI into a character driver Change-Id: Ia9c83af4d52e82e845b6a847c3e82e33d1920ae0	2014-11-10 14:43:27 +01:00
David van Moolenbroek	521fa314e2	Add trace(1): the MINIX3 system call tracer Change-Id: Ib970c8647409196902ed53d6e9631a1673a4ab2e	2014-11-04 21:46:31 +00:00
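
Commit da21d85025 above names posix_openpt(3), grantpt(3), and unlockpt(3) as the Unix98 pseudo terminal calls it adds. As a rough illustration of how a userland program would typically chain them, here is a minimal sketch using only the standard POSIX calls. The helper name open_pty_pair() is hypothetical and not taken from the MINIX 3 sources, and the slave path returned by ptsname(3) is assumed to be under /dev/pts/ only because the commit message mentions that directory.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate a Unix98 pseudo terminal pair. On success, return the master
 * file descriptor and store the slave descriptor in *slave_fd. On failure,
 * return -1. open_pty_pair() is a hypothetical helper, not MINIX code.
 */
int open_pty_pair(int *slave_fd)
{
    int master;
    char *name;

    assert(slave_fd != NULL);

    /* Open the master side of a new pseudo terminal. */
    if ((master = posix_openpt(O_RDWR | O_NOCTTY)) < 0)
        return -1;

    /*
     * Grant access to the slave, then unlock it. Per the commit message,
     * MINIX 3 creates the slave node on first grantpt(3) call.
     */
    if (grantpt(master) != 0 || unlockpt(master) != 0) {
        close(master);
        return -1;
    }

    /* Look up the slave path name and open the slave side. */
    if ((name = ptsname(master)) == NULL ||
        (*slave_fd = open(name, O_RDWR | O_NOCTTY)) < 0) {
        close(master);
        return -1;
    }

    return master;
}
```

A caller would pass the address of an int, check for a -1 return, and close both descriptors when done. No root privileges should be needed on a system where unprivileged allocation is supported, which is the point the commit makes about tmux(1).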

27 Commits
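
Commit 29346ab043 in the list above adds wait4(2) to PM and states that only the user and system times of the exited child are filled in, with all other rusage fields left zero. The following sketch shows what a caller of wait4(2) looks like in general. The helper run_child() is hypothetical and exists only for illustration, and the code uses nothing beyond the standard BSD-style interface.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits immediately with the given code, reap it with
 * wait4(2), and return its exit status, or -1 on error. The child's
 * resource usage is stored in *ru. run_child() is a hypothetical helper.
 */
int run_child(int code, struct rusage *ru)
{
    pid_t pid;
    int status;

    /* Exit statuses are limited to eight bits. */
    assert(code >= 0 && code <= 255);

    if ((pid = fork()) < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: terminate right away */

    /* Wait for this specific child and collect its rusage. */
    if (wait4(pid, &status, 0, ru) != pid)
        return -1;
    if (!WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

After a successful call, ru->ru_utime and ru->ru_stime hold the child's user and system times. According to the commit message, a program on MINIX 3 should not rely on the remaining rusage fields.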