• Linus Torvalds's avatar
    pipes: add a "packetized pipe" mode for writing · 9883035a
    Linus Torvalds authored
    
    
    The actual internal pipe implementation is already really about
    individual packets (called "pipe buffers"), and this simply exposes that
    as a special packetized mode.
    
    When we are in the packetized mode (marked by O_DIRECT as suggested by
    Alan Cox), a write() on a pipe will not merge the new data with previous
    writes, so each write will get a pipe buffer of its own.  The pipe
    buffer is then marked with the PIPE_BUF_FLAG_PACKET flag, which in turn
    will tell the reader side to break the read at that boundary (and throw
    away any partial packet contents that do not fit in the read buffer).
    
    End result: as long as you do writes less than PIPE_BUF in size (so that
    the pipe doesn't have to split them up), you can now treat the pipe as a
    packet interface, where each read() system call will read one packet at
    a time.  You can just use a sufficiently big read buffer (PIPE_BUF is
    sufficient, since bigger than that doesn't guarantee atomicity anyway),
    and the return value of the read() will naturally give you the size of
    the packet.
    
    NOTE! We do not support zero-sized packets, and zero-sized reads and
    writes to a pipe continue to be no-ops.  Also note that big packets will
    currently be split at write time, but that the size at which that
    happens is not really specified (except that it's bigger than PIPE_BUF).
    Currently that limit is the system page size, but we might want to
    explicitly support bigger packets some day.
    
    The main user for this is going to be the autofs packet interface,
    allowing us to stop having to care so deeply about exact packet sizes
    (which have had bugs with 32/64-bit compatibility modes).  But user
    space can create packetized pipes with "pipe2(fd, O_DIRECT)", which will
    fail with an EINVAL on kernels that do not support this interface.
    Tested-by: default avatarMichael Tokarev <mjt@tls.msk.ru>
    Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
    Cc: David Miller <davem@davemloft.net>
    Cc: Ian Kent <raven@themaw.net>
    Cc: Thomas Meyer <thomas@m3y3r.de>
    Cc: stable@kernel.org  # needed for systemd/autofs interaction fix
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    9883035a