  1. 21 Nov, 2011 1 commit
      GFS2: move toward a generic multi-block allocator · 6e87ed0f
      Bob Peterson authored
      
      This patch is a revision of the one I previously posted.
      I tried to integrate all the suggestions Steve gave.
      The purpose of the patch is to change function gfs2_alloc_block
      (allocate either a dinode block or an extent of data blocks)
      to a more generic gfs2_alloc_blocks function that can
      allocate both a dinode _and_ an extent of data blocks in the
      same call. This will ultimately help us create a multi-block
      reservation scheme to reduce file fragmentation.
      
      This patch moves more toward a generic multi-block allocator that
      takes a pointer to the number of data blocks to allocate, plus whether
      or not to allocate a dinode. In theory, it could be called to allocate
      (1) a single dinode block, (2) a group of one or more data blocks, or
      (3) a dinode plus several data blocks.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
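The calling convention described in 6e87ed0f (a pointer to the requested number of data blocks, plus a flag for whether to allocate a dinode) can be illustrated with a toy, user-space sketch. This is not the GFS2 code: `toy_rgrp`, `toy_alloc_blocks`, the flat state array, and the search strategy are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NBLOCKS 64

/* Hypothetical per-block states for this sketch only. */
enum toy_state { TOY_FREE = 0, TOY_USED = 1, TOY_DINODE = 3 };

/* Hypothetical stand-in for a resource group: one state byte per block. */
struct toy_rgrp {
    uint8_t state[TOY_NBLOCKS];
};

/*
 * Allocate an optional dinode followed by an extent of up to *ndata
 * contiguous data blocks, starting the search at 'goal'.
 * On success returns 0, stores the first allocated block in *first and
 * writes the number of data blocks actually obtained back to *ndata.
 * Returns -1 if no free block exists.
 */
static int toy_alloc_blocks(struct toy_rgrp *rg, size_t goal,
                            unsigned *ndata, bool dinode, size_t *first)
{
    size_t blk = goal, i;
    unsigned want = *ndata, got = 0;

    assert(goal < TOY_NBLOCKS);

    /* Find the first free block at or after the goal, wrapping once. */
    for (i = 0; i < TOY_NBLOCKS; i++) {
        blk = (goal + i) % TOY_NBLOCKS;
        if (rg->state[blk] == TOY_FREE)
            break;
    }
    if (i == TOY_NBLOCKS)
        return -1;

    *first = blk;
    if (dinode)
        rg->state[blk++] = TOY_DINODE;

    /* Extend the extent while the following blocks are free. */
    while (got < want && blk < TOY_NBLOCKS && rg->state[blk] == TOY_FREE) {
        rg->state[blk++] = TOY_USED;
        got++;
    }
    *ndata = got;
    return 0;
}
```

Passing the block count by pointer lets the caller ask for an extent and learn how many blocks it actually received, which covers all three cases listed in the commit message with a single entry point.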
  2. 15 Nov, 2011 1 commit
  3. 08 Nov, 2011 1 commit
  4. 21 Oct, 2011 5 commits
      GFS2: Cache the most recently used resource group in the inode · 54335b1f
      Steven Whitehouse authored
      
      This means that after the initial allocation for any inode, the
      last used resource group is cached in the inode for future use.
      This drastically reduces the number of lookups of resource
      groups in the common case, and thus the contention on that
      data structure.
      
      The allocation algorithm is the same as previously, except that we
      always check to see if the goal block is within the cached rgrp
      first before going to the rbtree to look one up.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
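The lookup order described in 54335b1f (check the cached resource group first, fall back to the tree only on a miss) might look roughly like the sketch below. The types and names are hypothetical, and a linear scan stands in for the rbtree lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical resource group: a contiguous range of block numbers. */
struct toy_rgrp {
    uint64_t start;
    uint64_t len;
};

/* Hypothetical in-core inode holding the most recently used rgrp. */
struct toy_inode {
    const struct toy_rgrp *cached;
    unsigned slow_lookups;      /* counts fallbacks to the slow path */
};

static int rgrp_contains(const struct toy_rgrp *rg, uint64_t blk)
{
    return rg && blk >= rg->start && blk - rg->start < rg->len;
}

/* Slow path: a linear scan standing in for the shared rbtree lookup. */
static const struct toy_rgrp *slow_lookup(const struct toy_rgrp *tbl,
                                          size_t n, uint64_t blk)
{
    for (size_t i = 0; i < n; i++)
        if (rgrp_contains(&tbl[i], blk))
            return &tbl[i];
    return NULL;
}

/* Check the cached rgrp first; only touch the shared structure on a miss. */
static const struct toy_rgrp *toy_find_rgrp(struct toy_inode *ip,
                                            const struct toy_rgrp *tbl,
                                            size_t n, uint64_t goal)
{
    assert(ip != NULL);
    if (rgrp_contains(ip->cached, goal))
        return ip->cached;
    ip->slow_lookups++;
    ip->cached = slow_lookup(tbl, n, goal);
    return ip->cached;
}
```

In this sketch, repeated allocations whose goal block stays inside the same resource group never reach `slow_lookup`, which is the effect the commit message attributes to the cache.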
      GFS2: Make resource groups "append only" during life of fs · 8339ee54
      Steven Whitehouse authored
      
      Since we have ruled out supporting online filesystem shrink,
      it is possible to make the resource group list append only
      during the life of a super block. This gives several benefits:
      
      Firstly, we only need to read new rindex elements as they are added
      rather than needing to reread the whole rindex file each time one
      element is added.
      
      Secondly, the rindex glock can be held for much shorter periods of
      time, and is completely removed from the fast path for allocations.
      The lock is taken in shared mode only when updating the resource
      groups when the first allocation occurs, and after a grow has
      taken place.
      
      Thirdly, this results in a reduction in code size, and everything
      gets a lot simpler to understand in this area.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      GFS2: Clean up gfs2_create · 9a63edd1
      Steven Whitehouse authored
      
      If we pass through knowledge of whether the creation is intended to be
      exclusive or not, then we can deal with that in gfs2_create_inode
      and remove one set of locking. Also this removes the loop in
      gfs2_create and simplifies the code a bit.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      GFS2: Use ->dirty_inode() · ab9bbda0
      Steven Whitehouse authored
      
      The aim of this patch is to use the newly enhanced ->dirty_inode()
      super block operation to deal with atime updates, rather than
      piggybacking that code into ->write_inode() as is currently
      done.
      
      The net result is a simplification of the code in various places
      and a reduction of the number of gfs2_dinode_out() calls since
      this is now implied by ->dirty_inode().
      
      Some of the mark_inode_dirty() calls have been moved under glocks
      in order to take advantage of then being able to avoid locking in
      ->dirty_inode() when we already have suitable locks.
      
      One consequence is that generic_write_end() now correctly deals
      with file size updates, so that we do not need a separate check
      for that afterwards. This also, indirectly, means that fdatasync
      should work correctly on GFS2 - the current code always syncs the
      metadata whether it needs to or not.
      
      Has survived testing with postmark (with and without atime) and
      also fsx.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      GFS2: Fix inode allocation error path · 40ac218f
      Steven Whitehouse authored
      
      If we have got far enough through the inode allocation code
      path that an inode has already been allocated, then we must
      call iput to dispose of it, if an error occurs during a
      later part of the process. This will always be the final iput
      since there will be no other references to the inode.
      
      Unlike when the inode has been unlinked, its block state will
      be GFS2_BLKST_INODE rather than GFS2_BLKST_UNLINKED so we need
      to skip the test in ->evict_inode() for this one case in order
      to ensure that it will be deallocated correctly. This patch adds
      a new flag in order to ensure that this will happen correctly.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  5. 25 Jul, 2011 1 commit
  6. 21 Jul, 2011 1 commit
  7. 20 Jul, 2011 3 commits
  8. 18 Jul, 2011 1 commit
      security: new security_inode_init_security API adds function callback · 9d8f13ba
      Mimi Zohar authored
      
      This patch changes the security_inode_init_security API by adding a
      filesystem specific callback to write security extended attributes.
      This change is in preparation for supporting the initialization of
      multiple LSM xattrs and the EVM xattr.  Initially the callback function
      walks an array of xattrs, writing each xattr separately, but could be
      optimized to write multiple xattrs at once.
      
      For existing security_inode_init_security() calls, which have not yet
      been converted to use the new callback function, such as those in
      reiserfs and ocfs2, this patch defines security_old_inode_init_security().
      Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
  9. 13 May, 2011 3 commits
  10. 10 May, 2011 1 commit
  11. 09 May, 2011 5 commits
      GFS2: Move most of the remaining inode.c into ops_inode.c · 194c011f
      Steven Whitehouse authored
      
      This is in preparation to remove inode.c and rename ops_inode.c
      to inode.c. Also most of the functions which were left in inode.c
      relate to the creation and lookup of inodes. I'm intending to work
      on consolidating some of that code, and it's easier when it's all in
      one place.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      GFS2: Remove gfs2_dinode_print() function · 94fb763b
      Steven Whitehouse authored
      
      This function was intended for debugging purposes, but it is not very
      useful. If we want to know what is on disk then all we need is a
      block number and gfs2_edit can give us much better information about
      what is there. Otherwise, if we are interested in what is stored in
      the in-core inode, it doesn't help us out there either.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      GFS2: When adding a new dir entry, inc link count if it is a subdir · 3d6ecb7d
      Steven Whitehouse authored
      
      This adds an increment of the link count when we add a new directory
      entry, if that entry is itself a directory. This means that we no
      longer need separate code to perform this operation.
      
      Now that both adding and removing directory entries automatically
      update the parent directory's link count if required, that makes
      the code shorter and simpler than before.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      GFS2: Make gfs2_dir_del update link count when required · 855d23ce
      Steven Whitehouse authored
      
      When we remove an entry from a directory, we can save ourselves
      some trouble if we know the type of the entry in question, since
      if it is itself a directory, we can update the link count of the
      parent at the same time as removing the directory entry.
      
      In addition this patch also merges the rmdir and unlink code which
      was almost identical anyway. This eliminates the calls to remove
      the . and .. directory entries on each rmdir (not needed since the
      directory will be deallocated, anyway) which was the only thing preventing
      passing the dentry to gfs2_dir_del(). The passing of the dentry
      rather than just the name allows us to figure out the type of the entry
      which is being removed, and thus adjust the link count when required.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
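Commits 3d6ecb7d and 855d23ce both move the parent's link-count adjustment into the directory-entry add and delete paths. A minimal sketch of that idea, assuming a hypothetical `toy_dir` structure and an `is_dir` flag in place of the real dentry, could be:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory inode: link count plus number of entries. */
struct toy_dir {
    unsigned nlink;     /* 2 for an empty directory ("." and parent's entry) */
    unsigned entries;
};

/* Add an entry; a subdirectory's ".." adds one link to the parent. */
static void toy_dir_add(struct toy_dir *parent, bool is_dir)
{
    assert(parent != NULL);
    parent->entries++;
    if (is_dir)
        parent->nlink++;
}

/* Remove an entry; removing a subdirectory drops the link it contributed. */
static int toy_dir_del(struct toy_dir *parent, bool is_dir)
{
    assert(parent != NULL);
    if (parent->entries == 0)
        return -1;
    parent->entries--;
    if (is_dir)
        parent->nlink--;
    return 0;
}
```

Because the entry type is known at the point of insertion or removal, callers in this sketch never adjust `nlink` themselves, which mirrors the simplification the commit messages describe.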
      GFS2: Don't use gfs2_change_nlink in link syscall · 2baee03f
      Steven Whitehouse authored
      
      There are three users of gfs2_change_nlink which add to the link
      count. Two of these are about to be removed in later patches, so
      this means that there will be no callers once that happens, allowing
      removal of that function, also in a later patch.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  12. 05 May, 2011 1 commit
      GFS2: Double check link count under glock · d192a8e5
      Steven Whitehouse authored
      
      To avoid any possible races relating to the link count, we need to
      recheck it under the inode's glock in all cases where it matters.
      Also to ensure we never get any nasty surprises, this patch also
      ensures that once the link count has hit zero it can never be
      elevated by rereading in data from disk.
      
      The only place we cannot provide a proper solution is in rename
      in the case where we are removing a target inode and we discover
      that the target inode has been already unlinked on another node.
      The race window is very small, and we return EAGAIN in this case
      to indicate what has happened. The proper solution would be to move
      the lookup parts of rename from the vfs into library calls which
      the fs could call directly, but that is potentially a very big job
      and this fix should cover most cases for now.
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
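The rule in d192a8e5 that a link count which has reached zero is never raised again by rereading from disk can be sketched as follows. `toy_inode` and `toy_set_nlink` are hypothetical names, and the sketch assumes the caller already holds whatever lock protects the inode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-core inode carrying only a link count. */
struct toy_inode {
    unsigned nlink;
};

/*
 * Refresh the in-core link count from the value read from disk.
 * Assumed to run with the inode's lock held. Once the in-core count
 * is zero it stays zero, whatever the on-disk copy says.
 */
static void toy_set_nlink(struct toy_inode *ip, unsigned disk_nlink)
{
    assert(ip != NULL);
    if (ip->nlink == 0)
        return;
    ip->nlink = disk_nlink;
}
```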
  13. 21 Jan, 2011 1 commit
  14. 17 Jan, 2011 2 commits
      fallocate should be a file operation · 2fe17c10
      Christoph Hellwig authored
      
      Currently all filesystems except XFS implement fallocate asynchronously,
      while XFS forced a commit.  Both of these are suboptimal - in case of O_SYNC
      I/O we really want our allocation on disk, especially for the !KEEP_SIZE
      case where we actually grow the file with user-visible zeroes.  On the
      other hand always committing the transaction is a bad idea for fast-path
      uses of fallocate like for example in recent Samba versions.   Given
      that block allocation is a data plane operation anyway change it from
      an inode operation to a file operation so that we have the file structure
      available that lets us check for O_SYNC.
      
      This also includes moving the code around for a few of the filesystems,
      and removing the already unneeded S_ISDIR checks given that we only wire
      up fallocate for regular files.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      make the feature checks in ->fallocate future proof · 64c23e86
      Christoph Hellwig authored
      
      Instead of various home grown checks that might need updates for new
      flags just check for any bit outside the mask of the features supported
      by the filesystem.  This makes the check future proof for any newly
      added flag.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
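The future-proof check described in 64c23e86 amounts to rejecting any mode bit outside the mask the filesystem supports. A small user-space sketch follows; the `TOY_` flag names and the supported mask are assumptions for illustration rather than the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for this sketch only. */
#define TOY_FL_KEEP_SIZE  0x01
#define TOY_FL_PUNCH_HOLE 0x02

/* Assumption: this toy filesystem supports only KEEP_SIZE. */
#define TOY_FS_SUPPORTED  TOY_FL_KEEP_SIZE

/*
 * Reject any bit outside the supported mask. A flag added in the
 * future is refused automatically, with no per-flag test to update.
 */
static int toy_check_mode(int mode)
{
    if (mode & ~TOY_FS_SUPPORTED)
        return -EOPNOTSUPP;
    return 0;
}
```

The alternative, testing for each unsupported flag by name, silently accepts any flag introduced after the check was written, which is the failure mode the commit is guarding against.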
  15. 13 Jan, 2011 2 commits
  16. 07 Jan, 2011 2 commits
      fs: provide rcu-walk aware permission i_ops · b74c79e9
      Nick Piggin authored
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
      fs: dcache reduce branches in lookup path · fb045adb
      Nick Piggin authored
      
      Reduce some branches and memory accesses in dcache lookup by adding dentry
      flags to indicate common d_ops are set, rather than having to check them.
      This saves a pointer memory access (dentry->d_op) in common path lookup
      situations, and saves another pointer load and branch in cases where we
      have d_op but not the particular operation.
      
      Patched with:
      
      git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
      Signed-off-by: Nick Piggin <npiggin@kernel.dk>
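The idea behind fb045adb, recording in the dentry's flags which operations are present so that the lookup path can test a flag instead of loading `d_op` and then the method pointer, can be sketched in user space. All `toy_` names and flag values here are hypothetical and do not reproduce the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits, one per optional operation. */
#define TOY_OP_HASH       0x1u
#define TOY_OP_COMPARE    0x2u
#define TOY_OP_REVALIDATE 0x4u

struct toy_dentry;

struct toy_dentry_ops {
    int (*d_hash)(const struct toy_dentry *);
    int (*d_compare)(const struct toy_dentry *, const struct toy_dentry *);
    int (*d_revalidate)(struct toy_dentry *);
};

struct toy_dentry {
    unsigned flags;
    const struct toy_dentry_ops *d_op;
};

/* Install the ops table once and cache which methods it provides. */
static void toy_set_d_op(struct toy_dentry *d, const struct toy_dentry_ops *op)
{
    assert(d->d_op == NULL);
    d->d_op = op;
    if (!op)
        return;
    if (op->d_hash)
        d->flags |= TOY_OP_HASH;
    if (op->d_compare)
        d->flags |= TOY_OP_COMPARE;
    if (op->d_revalidate)
        d->flags |= TOY_OP_REVALIDATE;
}

/* Fast-path test: one flag check, no dereference of d_op. */
static int toy_needs_revalidate(const struct toy_dentry *d)
{
    return (d->flags & TOY_OP_REVALIDATE) != 0;
}

/* Example method used to populate an ops table. */
static int toy_always_valid(struct toy_dentry *d)
{
    (void)d;
    return 1;
}
```

Routing every assignment through a setter is what makes the cached flags trustworthy, which is presumably why the commit converts all direct `d_op` assignments with the sed script shown above.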
  17. 30 Nov, 2010 2 commits
  18. 26 Oct, 2010 2 commits
  19. 30 Sep, 2010 1 commit
      GFS2 fatal: filesystem consistency error on rename · 46290341
      Bob Peterson authored
      
      This patch fixes a GFS2 problem whereby the first rename after a
      mount can result in a file system consistency error being flagged
      improperly and cause the file system to withdraw.  The problem is
      that the rename code tries to run the rgrp list with function
      gfs2_blk2rgrpd before the rgrp list is guaranteed to be read in
      from disk.  The patch makes the rename function hold the rindex
      glock (as the gfs2_unlink code does today) which reads in the rgrp
      list if need be.  There were a total of three places in the rename
      code that improperly referenced the rgrp list without the rindex
      glock and this patch fixes all three.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  20. 28 Sep, 2010 1 commit
  21. 20 Sep, 2010 3 commits