- 07 Jan, 2011 9 commits
-
-
Nick Piggin authored
Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
Nick Piggin authored
Require filesystems be aware of .d_revalidate being called in rcu-walk mode (nd->flags & LOOKUP_RCU). For now do a simple push down, returning -ECHILD from all implementations. Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
Nick Piggin authored
Reduce some branches and memory accesses in dcache lookup by adding dentry flags to indicate common d_ops are set, rather than having to check them. This saves a pointer memory access (dentry->d_op) in common path lookup situations, and saves another pointer load and branch in cases where we have d_op but not the particular operation. Patched with: git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
Nick Piggin authored
RCU free the struct inode. This will allow: - Subsequent store-free path walking patch. The inode must be consulted for permissions when walking, so an RCU inode reference is a must. - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want to take i_lock no longer need to take sb_inode_list_lock to walk the list in the first place. This will simplify and optimize locking. - Could remove some nested trylock loops in dcache code - Could potentially simplify things a bit in VM land. Do not need to take the page lock to follow page->mapping. The downsides of this is the performance cost of using RCU. In a simple creat/unlink microbenchmark, performance drops by about 10% due to inability to reuse cache-hot slab objects. As iterations increase and RCU freeing starts kicking over, this increases to about 20%. In cases where inode lifetimes are longer (ie. many inodes may be allocated during the average life span of a single inode), a lot of this cache reuse is not applicable, so the regression caused by this patch is smaller. The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU, however this adds some complexity to list walking and store-free path walking, so I prefer to implement this at a later date, if it is shown to be a win in real situations. I haven't found a regression in any non-micro benchmark so I doubt it will be a problem. Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
Nick Piggin authored
dcache_lock no longer protects anything. remove it. Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
Nick Piggin authored
The remaining usages for dcache_lock is to allow atomic, multi-step read-side operations over the directory tree by excluding modifications to the tree. Also, to walk in the leaf->root direction in the tree where we don't have a natural d_lock ordering. This could be accomplished by taking every d_lock, but this would mean a huge number of locks and actually gets very tricky. Solve this instead by using the rename seqlock for multi-step read-side operations, retry in case of a rename so we don't walk up the wrong parent. Concurrent dentry insertions are not serialised against. Concurrent deletes are tricky when walking up the directory: our parent might have been deleted when dropping locks so also need to check and retry for that. We can also use the rename lock in cases where livelock is a worry (and it is introduced in subsequent patch). Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
Nick Piggin authored
Add a new lock, dcache_inode_lock, to protect the inode's i_dentry list from concurrent modification. d_alias is also protected by d_lock. Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
Nick Piggin authored
Make d_count non-atomic and protect it with d_lock. This allows us to ensure a 0 refcount dentry remains 0 without dcache_lock. It is also fairly natural when we start protecting many other dentry members with d_lock. Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
Nick Piggin authored
Change d_delete from a dentry deletion notification to a dentry caching advise, more like ->drop_inode. Require it to be constant and idempotent, and not take d_lock. This is how all existing filesystems use the callback anyway. This makes fine grained dentry locking of dput and dentry lru scanning much simpler. Signed-off-by:
Nick Piggin <npiggin@kernel.dk>
-
- 10 Dec, 2010 1 commit
-
-
Chuck Lever authored
After a few unsuccessful NFS mount attempts in which the client and server cannot agree on an authentication flavor both support, the client panics. nfs_umount() is invoked in the kernel in this case. Turns out nfs_umount()'s UMNT RPC invocation causes the RPC client to write off the end of the rpc_clnt's iostat array. This is because the mount client's nrprocs field is initialized with the count of defined procedures (two: MNT and UMNT), rather than the size of the client's proc array (four). The fix is to use the same initialization technique used by most other upper layer clients in the kernel. Introduced by commit 0b524123, which failed to update nrprocs when support was added for UMNT in the kernel. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=24302 BugLink: http://bugs.launchpad.net/bugs/683938 Reported-by:
Stefan Bader <stefan.bader@canonical.com> Tested-by:
Stefan Bader <stefan.bader@canonical.com> Cc: stable@kernel.org # >= 2.6.32 Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 08 Dec, 2010 4 commits
-
-
Trond Myklebust authored
When a nfs_page is freed, nfs_free_request is called which also calls nfs_clear_request to clean out the lock and open contexts and free the pagecache page. However, a couple of places in the nfs code call nfs_clear_request themselves. What happens here if the refcount on the request is still high? We'll be releasing contexts and freeing pointers while the request is possibly still in use. Remove those bare calls to nfs_clear_context. That should only be done when the request is being freed. Note that when doing this, we need to watch out for tests of req->wb_page. Previously, nfs_set_page_tag_locked() and nfs_clear_page_tag_locked() would check the value of req->wb_page to figure out if the page is mapped into the nfsi->nfs_page_tree. We now indicate the page is mapped using the new bit PG_MAPPED in req->wb_flags . Reported-by:
Jeff Layton <jlayton@redhat.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Mi Jinlong authored
When nfs client(kernel) don't support NFSv4, maybe user build kernel without NFSv4, there is a problem. Using command "mount SERVER-IP:/nfsv3 /mnt/" to mount NFSv3 filesystem, mount should should success, but fail and get error: "mount.nfs: an incorrect mount option was specified" System call mount "nfs"(not "nfs4") with "vers=4", if CONFIG_NFS_V4 is not defined, the "vers=4" will be parsed as invalid argument and kernel return EINVAL to nfs-utils. About that, we really want get EPROTONOSUPPORT rather than EINVAL. This path make sure kernel parses argument success, and return EPROTONOSUPPORT at nfs_validate_mount_data(). Signed-off-by:
Mi Jinlong <mijinlong@cn.fujitsu.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Sergey Vlasov authored
The commit 129a84de (locks: fix F_GETLK regression (failure to find conflicts)) fixed the posix_test_lock() function by itself, however, its usage in NFS changed by the commit 9d6a8c5c (locks: give posix_test_lock same interface as ->lock) remained broken - subsequent NFS-specific locking code received F_UNLCK instead of the user-specified lock type. To fix the problem, fl->fl_type needs to be saved before the posix_test_lock() call and restored if no local conflicts were reported. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=23892 Tested-by:
Alexander Morozov <amorozov@etersoft.ru> Signed-off-by:
Sergey Vlasov <vsu@altlinux.ru> Cc: <stable@kernel.org> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Aneesh Kumar K.V authored
An update of mode bits can result in ACL value being changed. We need to mark the acl cache invalid when we update mode. Similarly we need to update file attribute when we change ACL value Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 07 Dec, 2010 2 commits
-
-
Trond Myklebust authored
No functional changes, but clarify the code. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
If we're searching for a specific cookie, and it isn't found in the page cache, we should try an uncached_readdir(). To do so, we return EBADCOOKIE, but we don't set desc->eof. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 02 Dec, 2010 1 commit
-
-
Trond Myklebust authored
We need to ensure that the entries in the nfs_cache_array get cleared when the page is removed from the page cache. To do so, we use the freepage address_space operation. Change nfs_readdir_clear_array to use kmap_atomic(), so that the function can be safely called from all contexts. Finally, modify the cache_page_release helper to call nfs_readdir_clear_array directly, when dealing with an anonymous page from 'uncached_readdir'. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 01 Dec, 2010 1 commit
-
-
Trond Myklebust authored
We need to use the cookie from the previous array entry, not the actual cookie that we are searching for (except for the case of uncached_readdir). Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 30 Nov, 2010 1 commit
-
-
Trond Myklebust authored
When comparing filehandles in the helper nfs_same_file(), we should not be using 'strncmp()': filehandles are not null terminated strings. Instead, we should just use the existing helper nfs_compare_fh(). Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 22 Nov, 2010 9 commits
-
-
Trond Myklebust authored
Store the dirent->d_type in the struct nfs_cache_array_entry so that we can use it in getdents() calls. This fixes a regression with the new readdir code. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
It looks as if the array size calculation in MAX_READDIR_ARRAY does not take the alignment of struct nfs_cache_array_entry into account. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
We should ignore the errors from the filldir callback, and just interpret them as meaning we should exit, however we should definitely pass back ENOMEM errors. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Currently, uncached_readdir() is broken because if fails to handle the results from nfs_readdir_xdr_to_array() correctly. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
nfs_do_filldir() must always free desc->page when it is done, otherwise we end up leaking the page. Also remove unused variable 'dentry'. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Some servers are known to be buggy w.r.t. this. Deal with them... Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Overflowing the buffer in the readdir ->decode_dirent() should not lead to a fatal error, but rather to an attempt to reread the record in question. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Arun Bharadwaj authored
When an application opens a file with O_DIRECT flag, if the size of the data that is written is equal to wsize, the client sends a WRITE RPC with stable flag set to UNSTABLE followed by a single COMMIT RPC rather than sending a single WRITE RPC with the stable flag set to FILE_SYNC. This a bug. Patch to fix this. Signed-off-by:
Arun R Bharadwaj <arun@linux.vnet.ibm.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 17 Nov, 2010 1 commit
-
-
Arnd Bergmann authored
The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 Nov, 2010 5 commits
-
-
Catalin Marinas authored
Strings allocated via kmemdup() in nfs_readdir_make_qstr() are referenced from the nfs_cache_array which is stored in a page cache page. Kmemleak does not scan such pages and it reports several false positives. This patch annotates the string->name pointer so that kmemleak does not consider it a real leak. Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Fix up the issue that array->eof_index needs to be able to be set even if array->size == 0. Ensure that we catch all important memory allocation error conditions and/or kmap() failures. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
This reverts commit 80e60639 . This change requires further fixes to ensure that the open doesn't succeed if the lookup later results in a regular file being created. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Paulius Zaleckas authored
Trying to mount NFS (root partition in my case) fails if CONFIG_NFS_V3 is not selected. nfs_validate_mount_data() returns EPROTONOSUPPORT, because of this check: #ifndef CONFIG_NFS_V3 if (args->version == 3) goto out_v3_not_compiled; #endif /* !CONFIG_NFS_V3 */ and args->version was always initialized to 3. It was working in 2.6.36 Signed-off-by:
Paulius Zaleckas <paulius.zaleckas@gmail.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 31 Oct, 2010 2 commits
-
-
Christoph Hellwig authored
The caller allocated it, the caller should free it. The only issue so far is that we could change the flp pointer even on an error return if the fl_change callback failed. But we can simply move the flp assignment after the fl_change invocation, as the callers don't care about the flp return value if the setlease call failed. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
J. Bruce Fields authored
We modified setlease to require the caller to allocate the new lease in the case of creating a new lease, but forgot to fix up the filesystem methods. Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
J. Bruce Fields <bfields@redhat.com> Acked-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 29 Oct, 2010 2 commits
-
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 28 Oct, 2010 2 commits
-
-
Geert Uytterhoeven authored
On m68k, which is 32-bit: fs/nfs/nfs4proc.c: In function ‘nfs41_sequence_done’: fs/nfs/nfs4proc.c:432: warning: format ‘%ld’ expects type ‘long int’, but argument 3 has type ‘int’ fs/nfs/nfs4proc.c: In function ‘nfs4_setup_sequence’: fs/nfs/nfs4proc.c:576: warning: format ‘%ld’ expects type ‘long int’, but argument 5 has type ‘int’ On 32-bit, ptrdiff_t is int; on 64-bit, ptrdiff_t is long. Introduced by commit dfb4f309 ("NFSv4.1: keep seq_res.sr_slot as pointer rather than an index") Signed-off-by:
Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Dan Carpenter authored
The intent was to test "*desc" for allocation failures, but it tests "desc" which is always a valid pointer here. Signed-off-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-