1. 19 Feb, 2012 1 commit
    • Replace the fd_sets in struct fdtable with an array of unsigned longs · 1fd36adc
      David Howells authored
      
      Replace the fd_sets in struct fdtable with an array of unsigned longs and then
      use the standard non-atomic bit operations rather than the FD_* macros.
      
      This:
      
       (1) Removes the abuses of struct fd_set:
      
           (a) Since we don't want to allocate a full fd_set the vast majority of the
           	 time, we actually, in effect, just allocate a just-big-enough array of
           	 unsigned longs and cast it to an fd_set type - so why bother with the
           	 fd_set at all?
      
           (b) Some places outside of the core fdtable handling code (such as
           	 SELinux) want to look inside the array of unsigned longs hidden inside
           	 the fd_set struct for more efficient iteration over the entire set.
      
       (2) Eliminates the use of FD_*() macros in the kernel completely.
      
       (3) Permits the __FD_*() macros to be deleted entirely where not exposed to
           userspace.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Link: http://lkml.kernel.org/r/20120216174954.23314.48147.stgit@warthog.procyon.org.uk
      
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
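As a rough illustration of the idea in this commit, the userspace sketch below keeps descriptor bits in a just-big-enough array of unsigned longs and manipulates them with plain (non-atomic) bit operations. The helper names (`fd_bit_set`, `fd_bit_clear`, `fd_bit_test`, `fd_count_open`) are hypothetical and are not the kernel's API; they only model the technique the message describes, including the word-at-a-time iteration that callers such as SELinux want.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical helpers: non-atomic bit operations on an unsigned long array. */
void fd_bit_set(unsigned long *map, size_t fd)
{
    map[fd / BITS_PER_LONG] |= 1UL << (fd % BITS_PER_LONG);
}

void fd_bit_clear(unsigned long *map, size_t fd)
{
    map[fd / BITS_PER_LONG] &= ~(1UL << (fd % BITS_PER_LONG));
}

bool fd_bit_test(const unsigned long *map, size_t fd)
{
    return (map[fd / BITS_PER_LONG] >> (fd % BITS_PER_LONG)) & 1UL;
}

/*
 * Count set bits by walking whole words, skipping empty ones.
 * Assumes any bits past max_fds in the last word are zero.
 */
size_t fd_count_open(const unsigned long *map, size_t max_fds)
{
    size_t count = 0;

    assert(map != NULL);
    for (size_t i = 0; i < BITS_TO_LONGS(max_fds); i++) {
        unsigned long word = map[i];

        if (word == 0)
            continue;               /* whole word empty: nothing to scan */
        for (; word != 0; word &= word - 1)
            count++;                /* clear the lowest set bit each pass */
    }
    return count;
}
```

A caller would declare `unsigned long map[BITS_TO_LONGS(nfds)]` for the number of descriptors it actually needs, rather than a full fixed-size fd_set.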
  2. 21 Mar, 2011 1 commit
  3. 13 Jan, 2011 1 commit
  4. 28 Oct, 2010 2 commits
  5. 12 Mar, 2010 1 commit
  6. 06 Mar, 2010 1 commit
  7. 04 Oct, 2009 1 commit
  8. 23 Sep, 2009 1 commit
  9. 16 Aug, 2009 1 commit
  10. 17 Jun, 2009 1 commit
  11. 14 Jan, 2009 3 commits
  12. 13 Jan, 2009 1 commit
  13. 06 Jan, 2009 1 commit
    • poll: allow f_op->poll to sleep · 5f820f64
      Tejun Heo authored
      
      f_op->poll is the only vfs operation which is not allowed to sleep.  It's
      because poll and select implementation used task state to synchronize
      against wake ups, which doesn't have to be the case anymore as wait/wake
      interface can now use custom wake up functions.  The non-sleep restriction
      can be a bit tricky because ->poll is not called from an atomic context
      and the result of accidentally sleeping in ->poll only shows up as
      temporary busy looping when the timing is right or rather wrong.
      
      This patch converts poll/select to use custom wake up function and use
      separate triggered variable to synchronize against wake up events.  The
      only added overhead is an extra function call during wake up, which is
      negligible.
      
      This patch removes the one non-sleep exception from vfs locking rules and
      is beneficial to userland filesystem implementations like FUSE, 9p or
      peculiar fs like spufs as it's very difficult for those to implement
      non-sleeping poll method.
      
      While at it, make the following cosmetic changes to make poll.h and
      select.c checkpatch friendly.
      
      * s/type * symbol/type *symbol/		   : three places in poll.h
      * remove blank line before EXPORT_SYMBOL() : two places in select.c
      
      Oleg: spotted missing barrier in poll_schedule_timeout()
      Davide: spotted missing write barrier in pollwake()
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Eric Van Hensbergen <ericvh@gmail.com>
      Cc: Ron Minnich <rminnich@sandia.gov>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Christoph Hellwig <hch@infradead.org>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Brad Boyer <flar@allandria.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Davide Libenzi <davidel@xmailserver.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
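The synchronization idea in this commit can be modelled in userspace as follows. This is a simplified sketch using C11 atomics, not the kernel code: the struct and function names are hypothetical, and the real implementation checks and resets the flag around an actual sleep. The point it illustrates is that the wake-up callback records the event in a separate `triggered` variable, so the waiter no longer depends on task state that a sleeping ->poll could disturb.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-call poll wait state. */
struct poll_wait_state {
    atomic_int triggered;   /* set to 1 by the wake-up callback */
};

void poll_wait_state_init(struct poll_wait_state *pws)
{
    atomic_init(&pws->triggered, 0);
}

/*
 * Custom wake-up callback: record that an event happened.
 * The release store publishes earlier writes to the waiter,
 * standing in for the write barrier mentioned in the message.
 */
void pollwake_sketch(struct poll_wait_state *pws)
{
    atomic_store_explicit(&pws->triggered, 1, memory_order_release);
}

/*
 * Returns true if the waiter would have to sleep, false if a wake-up
 * already arrived. The flag is cleared either way so that the next
 * scan starts fresh. (Simplification: the check and the reset are
 * folded into one atomic exchange.)
 */
bool poll_should_sleep(struct poll_wait_state *pws)
{
    return atomic_exchange_explicit(&pws->triggered, 0,
                                    memory_order_acq_rel) == 0;
}
```

Because the decision to sleep depends only on `triggered`, a wake-up that arrives while a poll method is itself sleeping is still observed afterwards.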
  14. 26 Oct, 2008 1 commit
  15. 07 Sep, 2008 2 commits
  16. 06 Sep, 2008 3 commits
  17. 22 Jun, 2008 1 commit
  18. 01 May, 2008 2 commits
  19. 30 Apr, 2008 2 commits
  20. 21 Apr, 2008 1 commit
  21. 06 Feb, 2008 1 commit
  22. 19 Oct, 2007 1 commit
  23. 17 Oct, 2007 3 commits
  24. 12 Sep, 2007 1 commit
    • Fix select on /proc files without ->poll · dd23aae4
      Alexey Dobriyan authored
      Taneli Vähäkangas <vahakang@cs.helsinki.fi> reported that commit 786d7e16
      aka "Fix rmmod/read/write races in /proc entries" broke SBCL + SLIME combo.
      
      The old code in do_select() used DEFAULT_POLLMASK if it couldn't find a
      ->poll handler.  The new code makes ->poll always there and returns 0 by
      default, which is not correct.  Return DEFAULT_POLLMASK instead.
      
      Steps to reproduce:
      
      	install emacs, SBCL, SLIME
      	emacs
      	M-x slime	in *inferior-lisp* buffer
      	[watch it doing "Connecting to Swank on port X.."]
      
      Please, apply before 2.6.23.
      
      P.S.: why SBCL can't just read(2) /proc/cpuinfo is a mystery.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: T Taneli Vahakangas <vahakang@cs.helsinki.fi>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
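The behaviour this fix restores can be sketched in userspace as below. The `file_sketch` struct and `poll_mask_for` function are hypothetical; the mask uses the same flag combination as the kernel's DEFAULT_POLLMASK (readable and writable), built here from the `<poll.h>` constants. A missing handler must report "ready" rather than 0, otherwise select() on such a file never returns.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Same flag combination as the kernel's DEFAULT_POLLMASK. */
#define DEFAULT_POLLMASK_SKETCH (POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM)

/* Hypothetical file object with an optional poll handler. */
struct file_sketch {
    unsigned int (*poll)(const struct file_sketch *f);
};

/*
 * Return the readiness mask for a file. Without a ->poll handler the
 * file is treated as always readable and writable, not as never ready.
 */
unsigned int poll_mask_for(const struct file_sketch *f)
{
    assert(f != NULL);
    if (f->poll == NULL)
        return DEFAULT_POLLMASK_SKETCH;
    return f->poll(f);
}
```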
  25. 09 May, 2007 1 commit
  26. 08 May, 2007 2 commits
  27. 10 Dec, 2006 1 commit
    • [PATCH] fdtable: Make fdarray and fdsets equal in size · bbea9f69
      Vadim Lobanov authored
      
      Currently, each fdtable supports three dynamically-sized arrays of data: the
      fdarray and two fdsets.  The code allows the number of fds supported by the
      fdarray (fdtable->max_fds) to differ from the number of fds supported by each
      of the fdsets (fdtable->max_fdset).
      
      In practice, it is wasteful for these two sizes to differ: whenever we hit a
      limit on the smaller-capacity structure, we will reallocate the entire fdtable
      and all the dynamic arrays within it, so any delta in the memory used by the
      larger-capacity structure will never be touched at all.
      
      Rather than hogging this excess, we shouldn't even allocate it in the first
      place, and keep the capacities of the fdarray and the fdsets equal.  This
      patch removes fdtable->max_fdset.  As an added bonus, most of the supporting
      code becomes simpler.
      Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
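A minimal userspace sketch of the resulting layout, assuming a single capacity field governs all three arrays. The struct and function names are hypothetical and the allocation strategy (rounding up to a whole number of longs, using calloc) is an illustration only, not the kernel's allocator.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical fdtable: one capacity shared by the fdarray and both fdsets. */
struct fdtable_sketch {
    size_t max_fds;                 /* capacity of fd[], open_fds, close_on_exec */
    void **fd;                      /* the fdarray */
    unsigned long *open_fds;        /* bitmap: descriptor in use */
    unsigned long *close_on_exec;   /* bitmap: close-on-exec flag */
};

void fdtable_sketch_free(struct fdtable_sketch *t)
{
    free(t->fd);
    free(t->open_fds);
    free(t->close_on_exec);
    t->fd = NULL;
    t->open_fds = NULL;
    t->close_on_exec = NULL;
    t->max_fds = 0;
}

/* Allocate all three arrays for at least `want` fds; 0 on success, -1 on failure. */
int fdtable_sketch_alloc(struct fdtable_sketch *t, size_t want)
{
    size_t words;

    assert(t != NULL && want > 0);
    words = (want + BITS_PER_LONG - 1) / BITS_PER_LONG;
    t->max_fds = words * BITS_PER_LONG;     /* round up to whole longs */
    t->fd = calloc(t->max_fds, sizeof(*t->fd));
    t->open_fds = calloc(words, sizeof(unsigned long));
    t->close_on_exec = calloc(words, sizeof(unsigned long));
    if (!t->fd || !t->open_fds || !t->close_on_exec) {
        fdtable_sketch_free(t);
        return -1;
    }
    return 0;
}
```

With only one capacity to track, growing the table means reallocating all three arrays to the same new size, so no array carries unused headroom.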
  28. 29 Sep, 2006 1 commit
    • [PATCH] enforce RLIMIT_NOFILE in poll() · 4e6fd33b
      Chris Snook authored
      
      POSIX states that poll() shall fail with EINVAL if nfds > OPEN_MAX.  In
      this context, POSIX is referring to sysconf(OPEN_MAX), which is the value
      of current->signal->rlim[RLIMIT_NOFILE].rlim_cur in the linux kernel, not
      the compile-time constant which happens to also be named OPEN_MAX.  In the
      current code, an application may poll up to max_fdset file descriptors,
      even if this exceeds RLIMIT_NOFILE.  The current code also breaks
      applications which poll more than max_fdset descriptors, which worked circa
      2.4.18 when the check was against NR_OPEN, which is 1024*1024.  This patch
      enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has
      been changed at run time with ulimit -n.
      
      To elaborate on the rationale for this, there are three cases:
      
      1) RLIMIT_NOFILE is at the default value of 1024
      
      In this (default) case, the patch changes nothing.  Calls with nfds > 1024
      fail with EINVAL both before and after the patch, and calls with nfds <=
      1024 pass the check both before and after the patch, since 1024 is the
      initial value of max_fdset.
      
      2) RLIMIT_NOFILE has been raised above the default
      
      In this case, poll() becomes more permissive, allowing polling up to
      RLIMIT_NOFILE file descriptors even if fewer than 1024 have been opened.
      The patch won't introduce new errors here.  If an application somehow
      depends on poll() failing when it polls with duplicate or invalid file
      descriptors, it's already broken, since this is already allowed below 1024,
      and will also work above 1024 if enough file descriptors have been open at
      some point to cause max_fdset to have been increased above nfds.
      
      3) RLIMIT_NOFILE has been lowered below the default
      
      In this case, the system administrator or the user has gone out of their
      way to protect the system from inefficient (or malicious) applications
      wasting kernel memory.  The current code allows polling up to 1024 file
      descriptors even if RLIMIT_NOFILE is much lower, which is not what the user
      or administrator intended.  Well-written applications which only poll
      valid, unique file descriptors will never notice the difference, because
      they'll hit the limit on open() first.  If an application gets broken
      because of the patch in this case, then it was already poorly/maliciously
      designed, and allowing it to work in the past was a violation of POSIX and
      a DoS risk on low-resource systems.
      
      With this patch, poll() will permit exactly what POSIX suggests, no more,
      no less, and for any run-time value set with ulimit -n, not just 256 or
      1024.  There are existing apps which poll a large number of file
      descriptors, some of which may be invalid, and if those numbers straddle
      1024, they currently fail with or without the patch in -mm, though they
      worked fine under 2.4.18.
      Signed-off-by: Chris Snook <csnook@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
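The check described in this commit can be approximated from userspace as below. The function names are hypothetical; `poll_check_nfds` models the comparison against the soft RLIMIT_NOFILE value, and `poll_check_nfds_current` applies it to the calling process's limit as reported by getrlimit(2).

```c
#include <errno.h>
#include <sys/resource.h>

/*
 * Model of the check: nfds may not exceed the soft RLIMIT_NOFILE value,
 * whatever it has been set to at run time.
 */
int poll_check_nfds(unsigned long nfds, rlim_t nofile_cur)
{
    return nfds > nofile_cur ? -EINVAL : 0;
}

/* Apply the same check against the calling process's current soft limit. */
int poll_check_nfds_current(unsigned long nfds)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -errno;
    return poll_check_nfds(nfds, rl.rlim_cur);
}
```

The three cases in the message map onto the second argument: 1024 for the default, a larger value when the limit has been raised, and a smaller one (for example 256) when it has been lowered.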
  29. 25 Jun, 2006 1 commit