phunix

Author	SHA1	Message	Date
Arne Welzel	bf33a1c097	libsys: add sys_safememset()	2012-09-26 02:18:00 +02:00
Arne Welzel	d8a89dcbe6	kernel: add safememset() kernel call	2012-09-26 02:18:00 +02:00
Ben Gras	2d72cbec41	SYSENTER/SYSCALL support . add cpufeature detection of both . use it for both ipc and kernelcall traps, using a register for call number . SYSENTER/SYSCALL does not save any context, therefore userland has to save it . to accommodate multiple kernel entry/exit types, the entry type is recorded in the process struct. hitherto all types were interrupt (soft int, exception, hard int); now SYSENTER/SYSCALL is new, with the difference that context is not fully restored from proc struct when running the process again. this can't be done as some information is missing. . complication: cases in which the kernel has to fully change process context (i.e. sigreturn). in that case the exit type is changed from SYSENTER/SYSEXIT to soft-int (i.e. iret) and context is fully restored from the proc struct. this does mean the PC and SP must change, as the sysenter/sysexit userland code will otherwise try to restore its own context. this is true in the sigreturn case. . override all usage by setting libc_ipc=1	2012-09-24 15:53:43 +02:00
Ben Gras	ed1af3c86c	VM: full munmap complete munmap implementation; single-page references made a general munmap() implementation possible to write cleanly. . memory: let the MIOCRAMSIZE ioctl set the imgrd device size (but only to 0) . let the ramdisk command set sizes to 0 . use this command to set /dev/imgrd to 0 after mounting /usr in /etc/rc, so the boot time ramdisk is freed (about 4MB currently)	2012-09-18 13:17:52 +02:00
Ben Gras	053fa581b5	vm: remove stack handling for signals . moved to the kernel as the handling was only reading it; the kernel may as well write it too	2012-08-29 17:31:38 +02:00
Arun Thomas	697f0d097f	Rename sys_vmctl_get_cr3_i386	2012-08-12 23:30:54 +02:00
Ben Gras	b6ea15115c	kernel: facility for user-visible memory . map all objects named usermapped_.o with globally visible pages; usermapped_glo_.o with the VM 'global' bit on, i.e. permanently in tlb (very scarce resource!) . added kinfo, machine, kmessages and loadinfo for a start . modified log, tty to make use of the shared messages struct	2012-07-28 20:57:38 +00:00
Ben Gras	50e2064049	No more intel/minix segments. This commit removes all traces of Minix segments (the text/data/stack memory map abstraction in the kernel) and significance of Intel segments (hardware segments like CS, DS that add offsets to all addressing before page table translation). This ultimately simplifies the memory layout and addressing and makes the same layout possible on non-Intel architectures. There are only two types of addresses in the world now: virtual and physical; even the kernel and processes have the same virtual address space. Kernel and user processes can be distinguished at a glance as processes won't use 0xF0000000 and above. No static pre-allocated memory sizes exist any more. Changes to booting: . The pre_init.c leaves the kernel and modules exactly as they were left by the bootloader in physical memory . The kernel starts running using physical addressing, loaded at a fixed location given in its linker script by the bootloader. All code and data in this phase are linked to this fixed low location. . It makes a bootstrap pagetable to map itself to a fixed high location (also in linker script) and jumps to the high address. All code and data then use this high addressing. . All code/data symbols linked at the low addresses is prefixed by an objcopy step with __k_unpaged_, so that that code cannot reference highly-linked symbols (which aren't valid yet) or vice versa (symbols that aren't valid any more). . The two addressing modes are separated in the linker script by collecting the unpaged_.o objects and linking them with low addresses, and linking the rest high. Some objects are linked twice, once low and once high. . The bootstrap phase passes a lot of information (e.g. free memory list, physical location of the modules, etc.) using the kinfo struct. . After this bootstrap the low-linked part is freed. . The kernel maps in VM into the bootstrap page table so that VM can begin executing. 
Its first job is to make page tables for all other boot processes. So VM runs before RS, and RS gets a fully dynamic, VM-managed address space. VM gets its privilege info from RS as usual but that happens after RS starts running. . Both the kernel loading VM and VM organizing boot processes happen using the libexec logic. This removes the last reason for VM to still know much about exec() and vm/exec.c is gone. Further Implementation: . All segments are based at 0 and have a 4 GB limit. . The kernel is mapped in at the top of the virtual address space so as not to constrain the user processes. . Processes do not use segments from the LDT at all; there are no segments in the LDT any more, so no LLDT is needed. . The Minix segments T/D/S are gone and so none of the user-space or in-kernel copy functions use them. The copy functions use a process endpoint of NONE to realize it's a physical address, virtual otherwise. . The umap call only makes sense to translate a virtual address to a physical address now. . Segments-related calls like newmap and alloc_segments are gone. . All segments-related translation in VM is gone (vir2map etc). . Initialization in VM is simpler as no moving around is necessary. . VM and all other boot processes can be linked wherever they wish and will be mapped in at the right location by the kernel and VM respectively. Other changes: . The multiboot code is less special: it does not use mb_print for its diagnostics any more but uses printf() as normal, saving the output into the diagnostics buffer, only printing to the screen using the direct print functions if a panic() occurs. . The multiboot code uses the flexible 'free memory map list' style to receive the list of free memory if available. . The kernel determines the memory layout of the processes to a degree: it tells VM where the kernel starts and ends and where the kernel wants the top of the process to be. VM then uses this entire range, i.e. 
the stack is right at the top, and mmap()ped bits of memory are placed below that downwards, and the break grows upwards. Other Consequences: . Every process gets its own page table as address spaces can't be separated any more by segments. . As all segments are 0-based, there is no distinction between virtual and linear addresses, nor between userspace and kernel addresses. . Less work is done when context switching, leading to a net performance increase. (8% faster on my machine for 'make servers'.) . The layout and configuration of the GDT makes sysenter and syscall possible.	2012-07-15 22:30:15 +02:00
Ben Gras	0fb2f83da9	drop segments from physcopy/vircopy invocations . sys_vircopy always uses D for both src and dst . sys_physcopy uses PHYS_SEG if and only if corresponding endpoint is NONE, so we can derive the mode (PHYS_SEG or D) from the endpoint arg in the kernel, dropping the seg args . fields in msg still filled in for backwards compatibility, using same NONE-logic in the library	2012-06-18 12:28:40 +00:00
Ben Gras	0e35eb0c6b	drop segments from safemap/safeunmap invocations	2012-06-18 12:28:40 +00:00
Ben Gras	2bfeeed885	drop segment from safecopy invocations . all invocations were S or D, so can safely be dropped to prepare for the segmentless world . still assign D to the SCP_SEG field in the message to make previous kernels usable	2012-06-16 16:22:51 +00:00
Ben Gras	769af57274	further libexec generalization . new mode for sys_memset: include process so memset can be done in physical or virtual address space. . add a mode to mmap() that lets a process allocate uninitialized memory. . this allows an exec()er (RS, VFS, etc.) to request uninitialized memory from VM and selectively clear the ranges that don't come from a file, leaving no uninitialized memory left for the process to see. . use callbacks for clearing the process, clearing memory in the process, and copying into the process; so that the libexec code can be used from rs, vfs, and in the future, kernel (to load vm) and vm (to load boot-time processes)	2012-06-07 15:15:02 +02:00
Ben Gras	ee4016155e	vm: add third-party mmap() mode and PROCCTL these two functions will be used to support all exec() functionality going into a single library shared by RS and VFS and exec() knowledge leaving VM. . third-party mmap: allow certain processes (VFS, RS) to do mmap() on behalf of another process . PROCCTL: used to free and clear a process' address space	2012-06-07 12:43:16 +02:00
Ben Gras	755102d67f	AT_SUN_EXECNAME support . vfs: pass execname in aux vectors . ld.elf_so: use this to expand $ORIGIN . this requires the executable to reserve more space at exec() calling time	2012-04-26 13:32:39 +02:00
Ben Gras	53002f6f6c	recognize and execute dynamically linked executables . generalize libexec slightly to get some more necessary information from ELF files, e.g. the interpreter . execute dynamically linked executables when exec()ed by VFS . switch to netbsd variant of elf32.h exclusively, solves some conflicting headers	2012-04-16 00:41:42 +00:00
David van Moolenbroek	6aa61efd09	VBOX: add host/guest communication interface This interface can be used by other system processes by means of the newly provided vbox API in libsys.	2012-04-09 15:56:20 +02:00
David van Moolenbroek	70abb127cc	Add sys_vumap() kernel call This new call is a vectored version of sys_umap(). It supports batch lookups, non-contiguous memory, faulting in memory, and basic access checks.	2012-03-24 19:51:13 +01:00
David van Moolenbroek	f140910d3c	Clean up a stale a.out-related declaration	2012-03-19 00:10:18 +01:00
Tomas Hruby	72b7abd1a1	VFS - no CANCEL for async non-blocking operations - if an operation (R, W, IOCTL) is non blocking, a flag is set and sent to the device. - nothing changes for sync devices - async devices should reply asap if an operation is non-blocking. We must trust the devices, but we had to trust them anyway to reply to CANCEL correctly - we save sending CANCEL commands to async devices. This greatly simplifies the protocol. Asynchronous devices can always reply when a reply is ready and do not need to deal with other situations - currently, none of our drivers use the flags since they drive virtual devices which do not block	2012-03-02 15:44:48 +00:00
Ben Gras	2fe8fb192f	Full switch to clang/ELF. Drop ack. Simplify. There is important information about booting non-ack images in docs/UPDATING. ack/aout-format images can't be built any more, and booting clang/ELF-format ones is a little different. Updating to the new boot monitor is recommended. Changes in this commit: . drop boot monitor -> allowing dropping ack support . facility to copy ELF boot files to /boot so that old boot monitor can still boot fairly easily, see UPDATING . no more ack-format libraries -> single-case libraries . some cleanup of OBJECT_FMT, COMPILER_TYPE, etc cases . drop several ack toolchain commands, but not all support commands (e.g. aal is gone but acksize is not yet). . a few libc files moved to netbsd libc dir . new /bin/date as minix date used code in libc/ . test compile fix . harmonize includes . /usr/lib is no longer special: without ack, /usr/lib plays no kind of special bootstrapping role any more and bootstrapping is done exclusively through packages, so releases depend even less on the state of the machine making them now. . rename nbsd_lib* to lib* . reduce mtree	2012-02-14 14:52:02 +01:00
Gianluca Guida	fa59fc6eb4	Move shared headers in common/include Headers that will be shared between old includes and NetBSD-like includes are moved into common/include tree. They are still copied in /usr/include in 'make includes', so compilation and programs aren't affected.	2011-02-06 22:59:02 +00:00
Ben Gras	0203ea37bf	include - throw out gettiminglocks stuff from include	2011-02-04 13:42:54 +00:00
Dirk Vogt	c22564335f	Added possibility to inject input events to tty M include/Makefile A include/minix/input.h M include/minix/com.h M drivers/tty/keyboard.c M drivers/tty/tty.c M drivers/tty/tty.h M include/minix/syslib.h M lib/libsys/Makefile A lib/libsys/input.c	2010-11-17 14:53:07 +00:00
Tomas Hruby	ac780f36a0	sys_getcpuinfo()	2010-10-26 21:07:50 +00:00
Tomas Hruby	74c5cd7668	The profile utility can set the sprofiling mode - profile --nmi | --rtc sets the profiling mode - --rtc is default, uses BIOS RTC, cannot profile kernel - the preset frequency values apply - --nmi is only available in APIC mode as it uses the NMI watchdog, -f allows any frequency in Hz - both modes use compatible data structures	2010-09-23 10:49:42 +00:00
Tomas Hruby	a665ae3de1	Userspace scheduling - exporting stats - contributed by Bjorn Swift - adds process accounting, for example counting the number of messages sent, how often the process was preempted and how much time it spent in the run queue. These statistics, along with the current cpu load, are sent back to the user-space scheduler in the Out Of Quantum message. - the user-space scheduler may choose to make use of these statistics when making scheduling decisions. For instance the cpu load becomes especially useful when scheduling on multiple cores.	2010-09-19 15:52:12 +00:00
Tomas Hruby	6513d20744	SMP - Process is stopped when VM modifies the page tables - RTS_VMINHIBIT flag is used to stop process while VM is fiddling with its pagetables - more generic way of sending synchronous scheduling events among cpus - do the x-cpu smp sched calls only if the target process is runnable. If it is not, it cannot be running and it cannot become runnable while this CPU holds the BKL	2010-09-15 14:11:12 +00:00
Tomas Hruby	06b6e5624a	SMP - Changed prototype of sys_schedule() - sys_schedule can change only selected values, -1 means that the current value should be kept unchanged. For instance we mostly want to change the scheduling quantum and priority but we want to keep the process at the current cpu - RS can hand off its processes to scheduler - service can read the destination cpu from system.conf - RS can pass the information farther	2010-09-15 14:10:42 +00:00
David van Moolenbroek	354da24f5b	make getsysinfo() a system-land call	2010-09-14 21:50:05 +00:00
Ben Gras	5d6c2aae0a	gcov support, based on work contributed by Anton Kuijsten.	2010-08-25 13:06:43 +00:00
Cristiano Giuffrida	91a83fe265	Crash recovery and live update support for VM.	2010-07-20 23:03:52 +00:00
Thomas Veerman	ecc8a52f82	Add getnucred system call. Contributed by Thomas Cort	2010-07-15 13:24:57 +00:00
Cristiano Giuffrida	8cedace2f5	Scheduling parameters out of the kernel.	2010-07-13 15:30:17 +00:00
Cristiano Giuffrida	8427d774b6	RS live update support.	2010-07-09 18:29:04 +00:00
Cristiano Giuffrida	1f8dbed029	RS crash recovery support.	2010-07-06 22:05:21 +00:00
Cristiano Giuffrida	3de6a807ce	Configure settings for system services dynamically with the new service edit command.	2010-07-05 19:37:08 +00:00
David van Moolenbroek	2488cc6442	PCI: expose BAR sizes	2010-07-01 09:10:16 +00:00
Erik van der Kouwe	23284ee7bd	User-space scheduling for system processes	2010-07-01 08:32:33 +00:00
Cristiano Giuffrida	06700d05d1	Give RS a page table.	2010-06-28 21:53:37 +00:00
Cristiano Giuffrida	869a223d43	service clone command to clone system services on demand.	2010-06-28 21:38:29 +00:00
Ben Gras	fc01683584	include, vfs: statvfs, fstatvfs calls, contributed by Buccapatnam Tirumala, Gautam.	2010-06-23 23:53:50 +00:00
Arun Thomas	1bf6d23f34	Make exec() use entry point in a.out header	2010-06-10 14:59:10 +00:00
Arun Thomas	4c10a31440	Remove legacy MM, FS, and FS_PROC_NR macros	2010-06-08 13:58:01 +00:00
Tomas Hruby	b09bcf6779	Scheduling server (by Bjorn Swift) In this second phase, scheduling is moved from PM to its own scheduler (see r6557 for phase one). In the next phase we hope to a) include useful information in the "out of quantum" message and b) create some simple scheduling policy that makes use of that information. When the system starts up, PM will iterate over its process table and ask SCHED to take over scheduling unprivileged processes. This is done by sending a SCHEDULING_START message to SCHED. This message includes the processes endpoint, the parent's endpoint and its nice level. The scheduler adds this process to its schedproc table, issues a schedctl, and returns its own endpoint to PM - as the endpoint of the effective scheduler. When a process terminates, a SCHEDULING_STOP message is sent to the scheduler. The reason for this effective endpoint is for future compatibility. Some day, we may have a scheduler that, instead of scheduling the process itself, forwards the SCHEDULING_START message on to another scheduler. PM has information on who schedules whom. As such, scheduling messages from user-land are sent through PM. An example is when processes change their priority, using nice(). In that case, a getsetpriority message is sent to PM, which then sends a SCHEDULING_SET_NICE to the process's effective scheduler. When a process is forked through PM, it inherits its parent's scheduler, but is spawned with an empty quantum. As before, a request to fork a process flows through VM before returning to PM, which then wakes up the child process. This flow has been modified slightly so that PM notifies the scheduler of the new process, before waking up the child process. If the scheduler fails to take over scheduling, the child process is torn down and the fork fails with an erroneous value. Process priority is entirely decided upon using nice levels. 
PM stores a copy of each process's nice level and when a child is forked, its parent's nice level is sent in the SCHEDULING_START message. How this level is mapped to a priority queue is up to the scheduler. It should be noted that the nice level is used to determine the max_priority and the parent could have been in a lower priority when it was spawned. To prevent a CPU intensive process from hogging the CPU by continuously forking children that get scheduled in the max_priority, the scheduler should determine in which queue the parent is currently scheduled, and schedule the child in that same queue. Other fixes: The USER_Q in kernel/proc.h was incorrectly defined as NR_SCHED_QUEUES/2. That results in an "off by one" error when converting priority->nice->priority for nice=0. This also had the side effect that if someone were to set the MAX_USER_Q to something else than 0, then USER_Q would be off.	2010-05-18 13:39:04 +00:00
David van Moolenbroek	9ba65d2ea8	This patch switches the MINIX3 ethernet driver stack from a port-based model to an instance-based model. Each ethernet driver instance is now responsible for exactly one network interface card. The port field in /etc/inet.conf now acts as an instance field instead. This patch also updates the data link protocol. This update: - eliminates the concept of ports entirely; - eliminates DL_GETNAME entirely; - standardizes on using m_source for IPC and DL_ENDPT for safecopies; - removes error codes from TASK/STAT replies, as they were unused; - removes a number of other old or unused fields; - names and renames a few other fields. All ethernet drivers have been changed to: - conform to the new protocol, and exactly that; - take on an instance number based on a given "instance" argument; - skip that number of PCI devices in probe iterations; - use config tables and environment variables based on that number; - no longer be limited to a predefined maximum of cards in any way; - get rid of any leftover non-safecopy support and other ancient junk; - have a correct banner protocol figure, or none at all. Other changes: * Inet.conf is now taken to be line-based, and supports #-comments. No existing installations are expected to be affected by this. * A new, select-based asynchio library replaces the old one. Kindly contributed by Kees J. Bot. * Inet now supports use of select() on IP devices. Combined, the last two changes together speed up dhcpd considerably in the presence of multiple interfaces. * A small bug has been fixed in nonamed.	2010-05-17 22:22:53 +00:00
Ben Gras	c5c25e7abc	kernel/vm: change pde table info from single buffer to explicit per-process. makes code in kernel more readable, and allows better sanity checking on using the pde info.	2010-05-12 08:31:05 +00:00
Tomas Hruby	5c63cac05a	Removed defines not used since r6844.	2010-05-10 13:29:04 +00:00
Ben Gras	f78d8e74fd	secondary cache feature in vm. A new call to vm lets processes yield a part of their memory to vm, together with an id, getting newly allocated memory in return. vm is allowed to forget about it if it runs out of memory. processes can ask for it back using the same id. (These two operations are normally combined in a single call.) It can be used as an as-big-as-memory-will-allow block cache for filesystems, which is how mfs now uses it.	2010-05-05 11:35:04 +00:00
Cristiano Giuffrida	0164957abb	Unified crash recovery and live update. RS CHANGES: - Crash recovery is now implemented like live update. Two instances are kept side by side and the dead version is live updated into the new one. The endpoint doesn't change and the failure is not exposed (by default) to other system services. - The new instance can be created reactively (when a crash is detected) or proactively. In the latter case, RS can be instructed to keep a replica of the system service to perform a hot swap when the service fails. The flag SF_USE_REPL is set in that case. - The new flag SF_USE_REPL is supported for services in the boot image and dynamically started services through the RS interface (i.e. -p option in the service utility). - Fixed a free unallocated memory bug for core system services.	2010-04-27 11:17:30 +00:00
Tomas Hruby	f51eea4b32	Changed pagefault delivery to VM this patch changes the way pagefaults are delivered to VM. It adopts the same model as the out-of-quantum messages sent by kernel to a scheduler. - every time a userspace pagefault occurs, kernel creates a message which is sent to VM on behalf of the faulting process - the process is blocked on delivery to VM in the standard IPC code instead of waiting in a special in-kernel queue (stack) and is not runnable until VM tells the kernel that the pagefault is resolved and is free to clear the RTS_PAGEFAULT flag. - VM does not need to call the kernel and poll the pagefault information which saves many (1/2?) calls and kernel calls that return "no more data" - VM notification by kernel does not need to use signals - each entry in proc table is 12 bytes smaller (~3k saved)	2010-04-26 23:21:26 +00:00

172 Commits