If a select(2) call was issued on a file descriptor for which the file
pointer was closed due to invalidation (FILP_CLOSED), typically as the
result of a character/socket driver dying, the call would previously
return with an error: EINTR upon call entry or EIO on invalidation at
a later time. The former in particular could severely confuse
applications, which would assume the call was interrupted by a signal,
restart the select call and immediately get EINTR again, ad infinitum.
This patch changes the select(2) semantics such that for closed filps,
the file descriptor is returned as readable and/or writable (depending
on the requested operations), thereby letting the entire select call
finish successfully. Applications will then typically attempt to read
from and/or write to the file descriptor, resulting in an I/O error
that they should generally be better equipped to handle.
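
As an illustration of the application side only, the following is a
minimal sketch of a caller that benefits from the new semantics. The
helper name wait_and_read is hypothetical and not part of this patch.
It restarts select(2) on EINTR, which is the pattern that used to loop
forever, and lets any I/O error surface from read(2) instead.

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of this patch): wait until fd is
 * reported readable, then read from it.  Returns the read(2) result,
 * or -1 with errno set by select(2) or read(2).
 */
static ssize_t
wait_and_read(int fd, void *buf, size_t len)
{
	fd_set rfds;
	int r;

	do {
		FD_ZERO(&rfds);
		FD_SET(fd, &rfds);
		/* Restart only when a signal really interrupted us. */
		r = select(fd + 1, &rfds, NULL, NULL, NULL);
	} while (r < 0 && errno == EINTR);

	if (r < 0)
		return -1;

	/*
	 * With the new semantics, a descriptor with a closed filp is
	 * reported as readable, so a failure such as EIO shows up here.
	 */
	return read(fd, buf, len);
}
```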
This patch also fixes a potential problem with returning early from a
select(2) call if a bad file descriptor is given: previously, in such
cases not all actions taken so far would be undone; now they are.
Change-Id: Ia6581f8789473a8a6c200852fccf552691a17025
This patch adds the implementation of the BSD socket system calls
which have been introduced in an earlier patch. At the same time, it
adds support for communication with socket drivers, using a new
"socket device" (SDEV_) protocol. These two parts, implemented in
socket.c and sdev.c respectively, form the upper and lower halves of
the new BSD socket support in VFS. New mapping functionality for
socket domains and drivers is added as well, implemented in smap.c.
The rest of the changes mainly facilitate the separation of character
and socket driver calls, and do not make any fundamental alterations.
For example, while this patch changes VFS's select.c rather heavily,
the new select logic for socket drivers is the exact same as for
character drivers; the changes mainly separate the driver type
specific parts from the generic select logic further than before.
Change-Id: I2f13084dd3c8d3a68bfc69da0621120c8291f707
By now it has become clear that the VFS select code has an unusually
high concentration of bugs, and there is no indication that any form
of convergence to a bug-free state is in sight. Thus, for now, it
may be helpful to be able to dump the contents of the select tables
in order to track down any bugs in the future. Hopefully that will
allow the next bugs to be resolved slightly faster than before.
The debug dump can be triggered with "svrctl vfs get print_select".
Change-Id: Ia826746dce0f065d7f3b46aa9047945067b8263d
A select query could deadlock if:
- it was querying a character or socket device that, at the start of
the select query, was not known to be ready for the requested
operations;
- this device could not be checked immediately, due to another ongoing
query to the same character or socket driver;
- the select query had a timer that triggered before the device could
be checked, thereby changing the select query to non-blocking.
In this situation, a missing flag check would cause the select code to
conclude erroneously that the operations which it flagged for later,
were satisfied. At the same time, the same flag remained set, so that
the select query would continue to wait for that device. This
resulted in a deadlock. The same bug could most likely be triggered
through other scenarios that were even less likely to occur.
This patch fixes the race condition and puts in a hopefully slightly
more informative comment for the affected block of code.
In practice, the bug could be triggered fairly reliably by generating
lots of output in tmux.
Change-Id: I1c909255dcf552e6c7cef08b0cf5cbc41294b99c
Now that clock_t is an unsigned value, we can also allow the system
uptime to wrap. Essentially, instead of using (a <= b) to see if time
a occurs no later than time b, we use (b - a <= CLOCK_MAX / 2). No
CLOCK_MAX constant exists, so this patch adds TMRDIFF_MAX for that
purpose.
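
The comparison can be sketched as follows. This is an illustration
under assumptions, not the code from <minix/timers.h>: ticks_t,
TICKS_DIFF_MAX and ticks_is_first are hypothetical stand-ins for an
unsigned 32-bit clock_t, TMRDIFF_MAX and the real comparison macro.

```c
#include <stdint.h>

/* Hypothetical stand-in for an unsigned clock_t. */
typedef uint32_t ticks_t;

/* Hypothetical stand-in for TMRDIFF_MAX: half of the value range. */
#define TICKS_DIFF_MAX	((ticks_t)INT32_MAX)

/*
 * Return nonzero if time a occurs no later than time b.  The unsigned
 * subtraction wraps modulo 2^32, so the result stays correct across an
 * uptime wrap as long as a and b are at most TICKS_DIFF_MAX apart.
 */
static int
ticks_is_first(ticks_t a, ticks_t b)
{
	return (ticks_t)(b - a) <= TICKS_DIFF_MAX;
}
```

For example, a time just before the wrap still compares as earlier
than a time just after it, which (a <= b) would get wrong.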
We must therefore also avoid using values like 0 and LONG_MAX as
special values for absolute times. This patch extends the libtimers
interface so that it no longer uses 0 to indicate "no timeout".
Similarly, TMR_NEVER is now used as a special value only when
otherwise a relative time difference would be used. A minix_timer
structure is now considered in use when it has a watchdog function set,
rather than when the absolute expiry time is not TMR_NEVER. A few new
macros in <minix/timers.h> help with timer comparison and obtaining
properties from a minix_timer structure.
This patch also eliminates the union of timer arguments, instead using
the only union element that is actually used (the integer). This prevents
potential problems with e.g. live update. The watchdog function
prototype is changed to pass in the argument value rather than a
pointer to the timer structure, since obtaining the argument value was
the only current use of the timer structure anyway. The result is a
somewhat friendlier timers API.
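
The shape of the resulting API can be sketched as follows. All names
here (sketch_timer, watchdog_fn, sketch_timer_fire, record_arg) are
hypothetical and do not match the libtimers declarations. Clearing the
watchdog function before calling it is also an assumption of this
sketch.

```c
#include <stddef.h>

/* Hypothetical watchdog type: receives the argument value directly. */
typedef void (*watchdog_fn)(int arg);

/* Hypothetical timer structure, not the real minix_timer layout. */
struct sketch_timer {
	unsigned long exp_time;	/* absolute expiry time, may wrap */
	watchdog_fn func;	/* NULL means the timer is not in use */
	int arg;		/* plain integer argument, no union */
};

/* A timer counts as in use when it has a watchdog function set. */
static int
sketch_timer_in_use(const struct sketch_timer *tp)
{
	return tp->func != NULL;
}

/* Expire the timer: mark it unused, then call its watchdog function. */
static void
sketch_timer_fire(struct sketch_timer *tp)
{
	watchdog_fn func = tp->func;
	int arg = tp->arg;

	tp->func = NULL;
	func(arg);
}

/* Example watchdog function that records the argument it was given. */
static int last_arg;

static void
record_arg(int arg)
{
	last_arg = arg;
}
```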
The VFS select code required a few more invasive changes to restrict
the timer value to the new maximum, effectively matching the timer
code in PM. As a side effect, select(2) has been changed to reject
invalid timeout values. That required a change to the test set, which
relied on the previous, erroneous behavior.
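
A minimal sketch of such a check is shown below. It assumes that
"invalid" means a negative number of seconds or a microsecond count
outside the range [0, 1000000); the actual check in VFS may differ,
and the function name timeout_is_valid is hypothetical.

```c
#include <sys/time.h>

/*
 * Hypothetical validity check for a select(2) timeout.  Assumption:
 * a timeout is invalid if its seconds are negative or its
 * microseconds fall outside [0, 1000000).
 */
static int
timeout_is_valid(const struct timeval *tv)
{
	return tv->tv_sec >= 0 && tv->tv_usec >= 0 &&
	    tv->tv_usec < 1000000;
}
```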
Finally, while we're rewriting significant chunks of the timer code
anyway, also convert it to KNF and add a few more explanatory comments.
Change-Id: Id43165c3fbb140b32b90be2cca7f68dd646ea72e
Some select queries require a response from device drivers. If a
select call is nonblocking (with a zero timeout), the response to
the caller may have to be deferred until all involved drivers have
responded to the initial query. This is handled just fine.
However, if the select call has a timeout that is so short that it
triggers before all the involved drivers have responded, the
resulting alarm would be discarded, possibly resulting in the call
blocking forever. This fix changes the alarm handler such that if
the alarm triggers too early, the select call is further handled
as though it was nonblocking.
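
The intended behavior can be sketched with a simplified model. The
structure and function names below are hypothetical and do not match
the VFS select code. The model ignores readiness results and only
tracks whether the call may still block and how many driver replies
are outstanding.

```c
/* Hypothetical, simplified model of one pending select call. */
struct sketch_select {
	int pending_replies;	/* drivers that have not answered yet */
	int blocking;		/* nonzero while the call may still block */
	int done;		/* nonzero once the caller has been answered */
};

/*
 * The alarm fired.  Rather than discarding it when drivers are still
 * busy, turn the call into a nonblocking one.  The reply to the caller
 * is deferred until the last driver has answered.
 */
static void
sketch_alarm(struct sketch_select *se)
{
	se->blocking = 0;
	if (se->pending_replies == 0)
		se->done = 1;
}

/* A driver answered the initial query. */
static void
sketch_driver_reply(struct sketch_select *se)
{
	if (se->pending_replies > 0)
		se->pending_replies--;
	/* A nonblocking call completes once all drivers have answered. */
	if (se->pending_replies == 0 && !se->blocking)
		se->done = 1;
}
```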
This fix resolves a test77 deadlock on really slow systems.
Change-Id: Ib487c8fe436802c3e11c57355ae0c8480721f06e
The remapping from /dev/tty to the real controlling terminal in the
device code was confusing the select code. The latter is now aware
of this case and should handle it properly, at the cost of one extra
field in the filp structure.
There is a nasty, hopefully sufficiently rare case of /dev/tty being
kept open while controlling terminals are changing, that we are still
not handling. Doing so would require more than just a few changes,
but the code should at least detect and cleanly fail on this case.
Test77 now has a basic test set for selecting on /dev/tty.
Change-Id: Iaedea449cdb728d0e66a9de8faacdfd9638dfe92