- 18 Apr, 2011 1 commit
-
-
Bob Peterson authored
This patch fixes a deadlock in GFS2 where two processes are trying to reclaim an unlinked dinode: One holds the inode glock and calls gfs2_lookup_by_inum trying to look up the inode, which it can't, due to I_FREEING. The other has set I_FREEING from vfs and is at the beginning of gfs2_delete_inode waiting for the glock, which is held by the first. The solution is to add a new non_block parameter to the gfs2_iget function that causes it to return -ENOENT if the inode is being freed. Signed-off-by:
Bob Peterson <rpeterso@redhat.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 18 Jan, 2011 1 commit
-
-
Steven Whitehouse authored
In the (impossible, except if there is fs corruption) error path in gfs2_lookup_by_inum() if the call to gfs2_inode_refresh() fails, it was leaving the function by calling iput() rather than iget_failed(). This would cause future lookups of the same inode to block forever. This patch fixes the problem by moving the call to gfs2_inode_refresh() into gfs2_inode_lookup() where iget_failed() is part of the error path already. Also this cleans up some unreachable code and makes gfs2_set_iop() static. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 07 Jan, 2011 1 commit
-
-
Nick Piggin authored
Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
- 15 Nov, 2010 1 commit
-
-
Steven Whitehouse authored
This area of the code has always been a bit delicate due to the subtleties of lock ordering. The problem is that for "normal" alloc/dealloc, we always grab the inode locks first and the rgrp lock later. In order to ensure no races in looking up the unlinked, but still allocated inodes, we need to hold the rgrp lock when we do the lookup, which means that we can't take the inode glock. The solution is to borrow the technique already used by NFS to solve what is essentially the same problem (given an inode number, look up the inode carefully, checking that it really is in the expected state). We cannot do that directly from the allocation code (lock ordering again) so we give the job to the pre-existing delete workqueue and carry on with the allocation as normal. If we find there is no space, we do a journal flush (required anyway if space from a deallocation is to be released) which should block against the pending deallocations, so we should always get the space back. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 20 Sep, 2010 2 commits
-
-
Benjamin Marzinski authored
This patch adds support for fallocate to gfs2. Since the gfs2 does not support uninitialized data blocks, it must write out zeros to all the blocks. However, since it does not need to lock any pages to read from, gfs2 can write out the zero blocks much more efficiently. On a moderately full filesystem, fallocate works around 5 times faster on average. The fallocate call also allows gfs2 to add blocks to the file without changing the filesize, which will make it possible for gfs2 to preallocate space for the rindex file, so that gfs2 can grow a completely full filesystem. Signed-off-by:
Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
With the update of the truncate code, ip->i_disksize and inode->i_size are merely copies of each other. This means we can remove ip->i_disksize and use inode->i_size exclusively reducing the size of a GFS2 inode by 8 bytes. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 21 May, 2010 1 commit
-
-
Bob Peterson authored
The previous patch I wrote for reclaiming unlinked dinodes had some shortcomings and did not prevent all hangs. This version is much cleaner and more logical, and has passed very difficult testing. Sorry for the churn. Signed-off-by:
Bob Peterson <rpeterso@redhat.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 14 Apr, 2010 1 commit
-
-
Bob Peterson authored
This patch fixes a couple gfs2 problems with the reclaiming of unlinked dinodes. First, there were a couple of livelocks where everything would come to a halt waiting for a glock that was seemingly held by a process that no longer existed. In fact, the process did exist, it just had the wrong pid number in the holder information. Second, there was a lock ordering problem between inode locking and glock locking. Third, glock/inode contention could sometimes cause inodes to be improperly marked invalid by iget_failed. Signed-off-by:
Bob Peterson <rpeterso@redhat.com>
-
- 22 May, 2009 4 commits
-
-
Steven Whitehouse authored
Another function which is only called from one ops_inode.c so we move it and make it static. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
Move gfs2_readlinki into ops_inode.c and make it static Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
Move gfs2_rmdiri() into ops_inode.c and make it static. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This patch renames the ops_*.c files which have no counterpart without the ops_ prefix in order to shorten the name and make it more readable. In addition, ops_address.h (which was very small) is moved into inode.h and inode.h is cleaned up by adding extern where required. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 15 Apr, 2009 1 commit
-
-
Christoph Hellwig authored
Remove the weird pointer to file_operations mess and replace it with straight-forward defining of the lockinginstance names to the _nolock variants. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 24 Mar, 2009 1 commit
-
-
Steven Whitehouse authored
This is the big patch that I've been working on for some time now. There are many reasons for wanting to make this change such as: o Reducing overhead by eliminating duplicated fields between structures o Simplifcation of the code (reduces the code size by a fair bit) o The locking interface is now the DLM interface itself as proposed some time ago. o Fewer lookups of glocks when processing replies from the DLM o Fewer memory allocations/deallocations for each glock o Scope to do further optimisations in the future (but this patch is more than big enough for now!) Please note that (a) this patch relates to the lock_dlm module and not the DLM itself, that is still a separate module; and (b) that we retain the ability to build GFS2 as a standalone single node filesystem with out requiring the DLM. This patch needs a lot of testing, hence my keeping it I restarted my -git tree after the last merge window. That way, this has the maximum exposure before its merged. This is (modulo a few minor bug fixes) the same patch that I've been posting on and off the the last three months and its passed a number of different tests so far. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 05 Jan, 2009 2 commits
-
-
Steven Whitehouse authored
The final field in gfs2_dinode_host was the i_flags field. Thats renamed to i_diskflags in order to avoid confusion with the existing inode flags, and moved into the inode proper at a suitable location to avoid creating a "hole". At that point struct gfs2_dinode_host is no longer needed and as promised (quite some time ago!) it can now be removed completely. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
Move the contents of some headers which contained very little into more sensible places, and remove the original header files. This should make it easier to find things. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 18 Sep, 2008 1 commit
-
-
Steven Whitehouse authored
Until now, we've used the same scheme as GFS1 for atime. This has failed since atime is a per vfsmnt flag, not a per fs flag and as such the "noatime" flag was not getting passed down to the filesystems. This patch removes all the "special casing" around atime updates and we simply use the VFS's atime code. The net result is that GFS2 will now support all the same atime related mount options of any other filesystem on a per-vfsmnt basis. We do lose the "lazy atime" updates, but we gain "relatime". We could add lazy atime to the VFS at a later date, if there is a requirement for that variant still - I suspect relatime will be enough. Also we lose about 100 lines of code after this patch has been applied, and I have a suspicion that it will speed things up a bit, even when atime is "on". So it seems like a nice clean up as well. From a user perspective, everything stays the same except the loss of the per-fs atime quantum tweekable (ought to be per-vfsmnt at the very least, and to be honest I don't think anybody ever used it) and that a number of options which were ignored before now work correctly. Please let me know if you've got any comments. I'm pushing this out early so that you can all see what my plans are. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 27 Aug, 2008 1 commit
-
-
Steven Whitehouse authored
This patch fixes a locking issue in the rename code by ensuring that we hold the per sb rename lock over both directory and "other" renames which involve different parent directories. At the same time, this moved the (only called from one place) function gfs2_ok_to_move into the file that its called from, so we can mark it static. This should make a code a bit easier to follow. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com> Cc: Peter Staubach <staubach@redhat.com>
-
- 27 Jul, 2008 1 commit
-
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 10 Jul, 2008 1 commit
-
-
Li Xiaodong authored
The implementation of gfs2_inode_attr_in is removed. So remove its declaration. Signed-off-by:
Li Xiaodong <lixd@cn.fujitsu.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 03 Jul, 2008 1 commit
-
-
Miklos Szeredi authored
GFS2 calls permission() to verify permissions after locks on the files have been taken. For this it's sufficient to call gfs2_permission() instead. This results in the following changes: - IS_RDONLY() check is not performed - IS_IMMUTABLE() check is not performed - devcgroup_inode_permission() is not called - security_inode_permission() is not called IS_RDONLY() should be unnecessary anyway, as the per-mount read-only flag should provide protection against read-only remounts during operations. do_gfs2_set_flags() has been fixed to perform mnt_want_write()/mnt_drop_write() to protect against remounting read-only. IS_IMMUTABLE has been added to gfs2_permission() Repeating the security checks seems to be pointless, as they don't normally change, and if they do, it's independent of the filesystem state. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 31 Mar, 2008 2 commits
-
-
Steven Whitehouse authored
The blocks counter is almost a duplicate of the i_blocks field in the VFS inode. The only difference is that i_blocks can be only 32bits long for 32bit arch without large single file support. Since GFS2 doesn't handle the non-large single file case (for 32 bit anyway) this adds a new config dependency on 64BIT || LSF. This has always been the case, however we've never explicitly said so before. Even if we do add support for the non-LSF case, we will still not require this field to be duplicated since we will not be able to access oversized files anyway. So the net result of all this is that we shave 8 bytes from a gfs2_inode and get our config deps correct. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This patch improves the calculation of the tree height in order to reduce the number of operations which are carried out on each call to gfs2_block_map. In the common case, we now make a single comparison, rather than calculating the required tree height from scratch each time. Also in the case that the tree does need some extra height, we start from the current height rather from zero when we work out what the new height ought to be. In addition the di_height field is moved into the inode proper and reduced in size to a u8 since the value must be between 0 and GFS2_MAX_META_HEIGHT (10). Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 25 Jan, 2008 2 commits
-
-
Steven Whitehouse authored
Just like ext3 we now have three sets of address space operations to cover the cases of writeback, ordered and journalled data writes. This means that the individual operations can now become less complicated as we are able to remove some of the tests for file data mode from the code. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This adds a function "gfs2_is_writeback()" along the lines of the existing "gfs2_is_jdata()" in order to clean up the code and make the various tests for the inode mode more obvious. It also fixes the PageChecked() logic where we were resetting the flag too early in the case of an error path. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 10 Oct, 2007 1 commit
-
-
Benjamin Marzinski authored
There is a possible deadlock between two processes on the same node, where one process is deleting an inode, and another process is looking for allocated but unused inodes to delete in order to create more space. process A does an iput() on inode X, and it's i_count drops to 0. This causes iput_final() to be called, which puts an inode into state I_FREEING at generic_delete_inode(). There no point between when iput_final() is called, and when I_FREEING is set where GFS2 could acquire any glocks. Once I_FREEING is set, no other process on that node can successfully look up that inode until the delete finishes. process B locks the the resource group for the same inode in get_local_rgrp(), which is called by gfs2_inplace_reserve_i() process A tries to lock the resource group for the inode in gfs2_dinode_dealloc(), but it's already locked by process B process B waits in find_inode for the inode to have the I_FREEING state cleared. Deadlock. This patch solves the problem by adding an alternative to gfs2_iget(), gfs2_iget_skip(), that simply skips any inodes that are in the I_FREEING state.o The alternate test function is just like the original one, except that it fails if the inode is being freed, and sets a skipped flag. The alternate set function is just like the original, except that it fails if the skipped flag is set. Only try_rgrp_unlink() calls gfs2_iget_skip() instead of gfs2_iget(). Signed-off-by:
Benjamin E. Marzinski <bmarzins@redhat.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 09 Jul, 2007 4 commits
-
-
Wendy Cheng authored
GFS2 has been passing i_mode within NFS File Handle. Other than the wrong assumption that there is always room for this extra 16 bit value, the current gfs2_get_dentry doesn't really need the i_mode to work correctly. Note that GFS2 NFS code does go thru the same lookup code path as direct file access route (where the mode is obtained from name lookup) but gfs2_get_dentry() is coded for different purpose. It is not used during lookup time. It is part of the file access procedure call. When the call is invoked, if on-disk inode is not in-memory, it has to be read-in. This makes i_mode passing a useless overhead. Signed-off-by:
S. Wendy Cheng <wcheng@redhat.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Wendy Cheng authored
GFS2 lookup code doesn't ask for inode shared glock. This implies during in-memory inode creation for existing file, GFS2 will not disk-read in the inode contents. This leaves no_formal_ino un-initialized during lookup time. The un-initialized no_formal_ino is subsequently encoded into file handle. Clients will get ESTALE error whenever it tries to access these files. Signed-off-by:
S. Wendy Cheng <wcheng@redhat.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This patch fixes some sign issues which were accidentally introduced into the quota & statfs code during the endianess annotation process. Also included is a general clean up which moves all of the _host structures out of gfs2_ondisk.h (where they should not have been to start with) and into the places where they are actually used (often only one place). Also those _host structures which are not required any more are removed entirely (which is the eventual plan for all of them). The conversion routines from ondisk.c are also moved into the places where they are actually used, which for almost every one, was just one single place, so all those are now static functions. This also cleans up the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__. The net result is a reduction of about 100 lines of code, many functions now marked static plus the bug fixes as mentioned above. For good measure I ran the code through sparse after making these changes to check that there are no warnings generated. This fixes Red Hat bz #239686 Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 05 Feb, 2007 3 commits
-
-
Adrian Bunk authored
On Thu, Jan 11, 2007 at 10:26:27PM -0800, Andrew Morton wrote: >... > Changes since 2.6.20-rc3-mm1: >... > git-gfs2-nmw.patch >... > git trees >... This patch makes the needlessly globlal gfs2_change_nlink_i() static. Signed-off-by:
Adrian Bunk <bunk@stusta.de> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
S. Wendy Cheng authored
Second round of gfs2_rename lock re-ordering to allow Anaconda adding root partition on top of gfs2. Previous to this patch the recursive lock detector in glock.c can be triggered due to attempting to lock the rgrp twice. This fixes it by checking to see whether the rgrp is already locked. This fixes Red Hat bugzilla #221237 Signed-off-by:
S. Wendy Cheng <wcheng@redhat.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
S. Wendy Cheng authored
Bugzilla 215088 Fix deadlock in gfs2_change_nlink() while installing RHEL5 into GFS2 partition. The gfs2_rename() apparently needs block allocation for the new name (into the directory) where it requires rg locks. At the same time, while updating the nlink count for the replaced file, gfs2_change_nlink() tries to return the inode meta-data back to resource group where it needs rg locks too. Our logic doesn't allow process to acquire these locks recursively by the same process (RHEL installer) that results a BUG call. This only happens within rename code path and only if the destination file exists before the rename operation. Signed-off-by:
S. Wendy Cheng <wcheng@redhat.com> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 30 Nov, 2006 6 commits
-
-
Steven Whitehouse authored
The gfs2_glock_nq_m_atime function is unused in so far as its only ever called with num_gh = 1, and this falls through to the gfs2_glock_nq_atime function, so we might as well call that directly. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This function wasn't really doing the right thing. There was no need to update the inode size at this point and the updating of the i_blocks field has now been moved to the places where di_blocks is updated. A result of this patch and some those preceeding it is that unlocking a glock is now a much more efficient process, since there is no longer any requirement to copy data from the gfs2 inode into the vfs inode at this point. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
Remove the di_[amc]time fields and use inode->i_[amc]time fields instead. This saves 24 bytes from the gfs2_inode. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This removes the duplicate di_mode field in favour of using the inode->i_mode field. This saves 4 bytes. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Steven Whitehouse authored
This removes the device numbers from this structure by using inode->i_rdev instead. It also cleans up the code in gfs2_mknod. It results in shrinking the gfs2_inode by 8 bytes. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 25 Sep, 2006 1 commit
-
-
Steven Whitehouse authored
As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-