- 22 Nov, 2010 7 commits
-
-
Trond Myklebust authored
We should ignore the errors from the filldir callback and just interpret them as meaning we should exit. However, we should definitely pass back ENOMEM errors. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Currently, uncached_readdir() is broken because it fails to handle the results from nfs_readdir_xdr_to_array() correctly. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
nfs_do_filldir() must always free desc->page when it is done, otherwise we end up leaking the page. Also remove unused variable 'dentry'. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Some servers are known to be buggy w.r.t. this. Deal with them... Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Overflowing the buffer in the readdir ->decode_dirent() should not lead to a fatal error, but rather to an attempt to reread the record in question. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Arun Bharadwaj authored
When an application opens a file with the O_DIRECT flag and the size of the data written is equal to wsize, the client sends a WRITE RPC with the stable flag set to UNSTABLE followed by a single COMMIT RPC, rather than sending a single WRITE RPC with the stable flag set to FILE_SYNC. This is a bug. This patch fixes it. Signed-off-by:
Arun R Bharadwaj <arun@linux.vnet.ibm.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
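The boundary case described in this commit can be illustrated with a small userspace sketch. It assumes the decision reduces to comparing the write size against wsize; `pick_stable_flag` is a hypothetical name and is not taken from the NFS client code. The enum values follow the NFSv3 protocol definition of `stable_how`.

```c
#include <stddef.h>

/* Stable-how values as defined by the NFSv3 protocol (RFC 1813). */
enum stable_how { UNSTABLE = 0, DATA_SYNC = 1, FILE_SYNC = 2 };

/*
 * Hypothetical helper: pick the stable flag for an O_DIRECT write of
 * `count` bytes when the largest WRITE payload for the mount is `wsize`.
 */
static enum stable_how pick_stable_flag(size_t count, size_t wsize)
{
	/*
	 * A single WRITE carries the data whenever it fits in one RPC,
	 * including the boundary case count == wsize, so no COMMIT is
	 * needed afterwards.
	 */
	if (count <= wsize)
		return FILE_SYNC;

	/* Several WRITEs: send them UNSTABLE and finish with one COMMIT. */
	return UNSTABLE;
}
```

With a wsize of 4096, a 4096-byte write yields FILE_SYNC in this sketch, while a 4097-byte write yields UNSTABLE.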
-
- 20 Nov, 2010 2 commits
-
-
Lukas Czerner authored
A filesystem-independent ioctl was rejected as not common enough to be in the core vfs ioctl. Since we still need access to this functionality, this commit adds the ext4-specific ioctl EXT4_IOC_TRIM to dispatch ext4_trim_fs(). It takes an fstrim_range structure as an argument. fstrim_range is defined in include/linux/fs.h and its definition is as follows. struct fstrim_range { __u64 start; __u64 len; __u64 minlen; } start - first byte to trim; len - number of bytes to trim from start; minlen - minimum extent length to trim; free extents shorter than this number of bytes will be ignored. This will be rounded up to the fs block size. After the FITRIM is done, the number of actually discarded bytes is stored in fstrim_range.len to give the user better insight into how much storage space has really been released for wear-leveling. Signed-off-by:
Lukas Czerner <lczerner@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
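For the ext4 trim commit above, a userspace caller might look roughly like the following sketch. It assumes a Linux system whose `<linux/fs.h>` provides `struct fstrim_range` and the `FITRIM` request; `trim_range` is a hypothetical wrapper name, and the sketch does not use the ext4-specific EXT4_IOC_TRIM request mentioned in the message.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* struct fstrim_range, FITRIM */

/*
 * Hypothetical wrapper: ask the filesystem behind `fd` to discard free
 * extents in [start, start + len). On success, stores the number of
 * bytes actually discarded in *trimmed and returns 0. On failure,
 * returns -1 with errno set by ioctl().
 */
static int trim_range(int fd, uint64_t start, uint64_t len,
		      uint64_t minlen, uint64_t *trimmed)
{
	struct fstrim_range range = {
		.start = start,
		.len = len,
		.minlen = minlen,
	};

	if (ioctl(fd, FITRIM, &range) < 0)
		return -1;

	/* The discarded byte count is written back into range.len. */
	*trimmed = range.len;
	return 0;
}
```

The call normally requires an open descriptor on a mounted filesystem that supports discard, plus sufficient privileges; on an invalid descriptor it simply fails with -1.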
Lukas Czerner authored
There was concern that the FITRIM ioctl is not common enough to be included in the core vfs ioctl. As Christoph Hellwig pointed out, there's no real point in dispatching this out to a separate vector instead of just through ->ioctl. So this commit removes ioctl_fstrim() from the vfs ioctl and trim_fs from the super_operations structure. Signed-off-by:
Lukas Czerner <lczerner@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 19 Nov, 2010 1 commit
-
-
Darrick J. Wong authored
At the start of ext4_fill_super, ret is set to -EINVAL, and any failure path out of that function returns ret. However, the generic_check_addressable clause sets ret = 0 (if it passes), which means that a subsequent failure (e.g. a group checksum error) returns 0 even though the mount should fail. This causes vfs_kern_mount in turn to think that the mount succeeded, leading to an oops. A simple fix is to avoid using ret for the generic_check_addressable check, which was last changed in commit 30ca22c7. Signed-off-by:
Darrick J. Wong <djwong@us.ibm.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
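The error-code clobbering described in the ext4_fill_super commit above can be reproduced in a few lines of plain C. This is only a sketch of the pattern: `check_addressable`, `check_group_desc`, `fill_super_buggy` and `fill_super_fixed` are hypothetical stand-ins, not the ext4 functions.

```c
#include <errno.h>

/* Stand-ins for the real checks; each returns 0 on success. */
static int check_addressable(int ok) { return ok ? 0 : -EFBIG; }
static int check_group_desc(int ok)  { return ok ? 0 : -1; }

/* Buggy shape: the shared `ret` is overwritten with 0 by an early check. */
static int fill_super_buggy(int addressable_ok, int groups_ok)
{
	int ret = -EINVAL;

	ret = check_addressable(addressable_ok);
	if (ret)
		goto failed;
	if (check_group_desc(groups_ok))
		goto failed;	/* ret is 0 here: failure reported as success */
	ret = 0;
failed:
	return ret;
}

/* Fixed shape: the early check stores its result in its own variable. */
static int fill_super_fixed(int addressable_ok, int groups_ok)
{
	int ret = -EINVAL;
	int err = check_addressable(addressable_ok);

	if (err) {
		ret = err;
		goto failed;
	}
	if (check_group_desc(groups_ok))
		goto failed;	/* ret is still -EINVAL */
	ret = 0;
failed:
	return ret;
}
```

When the addressability check passes and the later check fails, the buggy shape returns 0 while the fixed shape returns -EINVAL.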
- 18 Nov, 2010 4 commits
-
-
Sage Weil authored
One of the readdir filldir_t callers was passing the raw ceph 64-bit ino instead of the hashed 32-bit one, producing an EOVERFLOW in the filler callback. Fix this by calling the ceph_vino_to_ino() helper to do the conversion. Reported-by:
Jan Smets <jan.smets@alcatel-lucent.com> Tested-by:
Jan Smets <jan.smets@alcatel-lucent.com> Signed-off-by:
Sage Weil <sage@newdream.net>
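To show why a raw 64-bit inode number can trigger EOVERFLOW where a 32-bit value is expected, here is a minimal sketch of one way to fold 64 bits into 32. It is an assumption for illustration only: `fold_ino64` is a hypothetical name and the XOR fold is not claimed to match what ceph_vino_to_ino() does.

```c
#include <stdint.h>

/*
 * Hypothetical fold of a 64-bit inode number into 32 bits by XOR-ing
 * the two halves. Zero is remapped to 1 because 0 is conventionally
 * treated as "no inode".
 */
static uint32_t fold_ino64(uint64_t ino64)
{
	uint32_t ino = (uint32_t)ino64 ^ (uint32_t)(ino64 >> 32);

	return ino ? ino : 1;
}
```

Values that already fit in 32 bits pass through unchanged, so the fold only affects inode numbers with bits set in the upper half.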
-
yangsheng authored
In jbd2_journal_init_dev(), we need to make sure the journal structure is fully initialized before calling jbd2_stats_proc_init(). Reviewed-by:
Andreas Dilger <andreas.dilger@oracle.com> Signed-off-by:
yangsheng <sheng.yang@oracle.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
Dan Carpenter authored
If the li_request_list was empty then it returned with the lock held. Instead of adding a "goto unlock" I just removed that special case and let it go past the empty list_for_each_safe(). Signed-off-by:
Dan Carpenter <error27@gmail.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
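The approach taken in the li_request_list commit above (drop the empty-list special case so that every path reaches the single unlock) can be sketched in userspace as follows. The `toy_lock` and `request` types and the `process_requests` function are hypothetical and stand in for the kernel mutex and list helpers.

```c
#include <stddef.h>

struct toy_lock { int held; };
struct request { struct request *next; int done; };

static void toy_lock_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { l->held = 0; }

/*
 * Process every request on the list. There is no special case for an
 * empty list: the loop body simply runs zero times, so the single
 * unlock at the bottom is reached on every path.
 */
static int process_requests(struct toy_lock *lock, struct request *head)
{
	int processed = 0;

	toy_lock_acquire(lock);
	for (struct request *r = head; r != NULL; r = r->next) {
		r->done = 1;
		processed++;
	}
	toy_lock_release(lock);
	return processed;
}
```

Keeping one exit point after the loop avoids the class of bug where an early return skips the release.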
Markus Trippelsdorf authored
ext4_end_bio calls put_page and kmem_cache_free before calling SetPageUptodate(). This can result in setting the PageUptodate bit on random pages and causes the following BUG: BUG: Bad page state in process rm pfn:52e54 page:ffffea0001222260 count:0 mapcount:0 mapping: (null) index:0x0 arch kernel: page flags: 0x4000000000000008(uptodate) Fix the problem by moving put_io_page() after the SetPageUptodate() call. Thanks to Hugh Dickins for analyzing this problem. Reported-by:
Markus Trippelsdorf <markus@trippelsdorf.de> Tested-by:
Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by:
Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu>
-
- 17 Nov, 2010 2 commits
-
-
Arnd Bergmann authored
lock_kernel is gone from the code, so the comments should be updated, too. nfsd now uses lock_flocks instead of lock_kernel to protect against posix file locks. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
J. Bruce Fields <bfields@redhat.com> Cc: linux-nfs@vger.kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Arnd Bergmann authored
The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 Nov, 2010 6 commits
-
-
Catalin Marinas authored
Strings allocated via kmemdup() in nfs_readdir_make_qstr() are referenced from the nfs_cache_array which is stored in a page cache page. Kmemleak does not scan such pages and it reports several false positives. This patch annotates the string->name pointer so that kmemleak does not consider it a real leak. Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Cc: Bryan Schumaker <bjschuma@netapp.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Fix up the issue that array->eof_index needs to be able to be set even if array->size == 0. Ensure that we catch all important memory allocation error conditions and/or kmap() failures. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
This reverts commit 80e60639. This change requires further fixes to ensure that the open doesn't succeed if the lookup later results in a regular file being created. Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Paulius Zaleckas authored
Trying to mount NFS (root partition in my case) fails if CONFIG_NFS_V3 is not selected. nfs_validate_mount_data() returns EPROTONOSUPPORT, because of this check: #ifndef CONFIG_NFS_V3 if (args->version == 3) goto out_v3_not_compiled; #endif /* !CONFIG_NFS_V3 */ and args->version was always initialized to 3. It was working in 2.6.36. Signed-off-by:
Paulius Zaleckas <paulius.zaleckas@gmail.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
Trond Myklebust authored
Nick Bowler reports: There are no unusual messages on the client... but I just logged into the server and I see lots of messages of the following form: nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! nfsd: request from insecure port (192.168.8.199:35766)! Bisected to commit 92476850 (SUNRPC: Properly initialize sock_xprt.srcaddr in all cases) Apparently, removing the 'transport->srcaddr.ss_family = family' from xs_create_sock() triggers this due to nlmclnt_lookup_host() incorrectly initialising the srcaddr family to AF_UNSPEC. Reported-by:
Nick Bowler <nbowler@elliptictech.com> Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com>
-
- 15 Nov, 2010 1 commit
-
-
Steven Whitehouse authored
This area of the code has always been a bit delicate due to the subtleties of lock ordering. The problem is that for "normal" alloc/dealloc, we always grab the inode locks first and the rgrp lock later. In order to ensure no races in looking up the unlinked, but still allocated inodes, we need to hold the rgrp lock when we do the lookup, which means that we can't take the inode glock. The solution is to borrow the technique already used by NFS to solve what is essentially the same problem (given an inode number, look up the inode carefully, checking that it really is in the expected state). We cannot do that directly from the allocation code (lock ordering again) so we give the job to the pre-existing delete workqueue and carry on with the allocation as normal. If we find there is no space, we do a journal flush (required anyway if space from a deallocation is to be released) which should block against the pending deallocations, so we should always get the space back. Signed-off-by:
Steven Whitehouse <swhiteho@redhat.com>
-
- 13 Nov, 2010 1 commit
-
-
Tao Ma authored
Commit 83fd9c7f changed l_level, l_requested and l_blocking of ocfs2_lock_res from int to unsigned char. But these fields are initialized to -1 (in ocfs2_lock_res_init_common), which corresponds to 255 for an unsigned char. So the whole dlm lock mechanism doesn't work now, which is a disaster for ocfs2. Cc: Goldwyn Rodrigues <rgoldwyn@suse.de> Signed-off-by:
Tao Ma <tao.ma@oracle.com> Signed-off-by:
Joel Becker <joel.becker@oracle.com>
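The effect of storing -1 in an unsigned char can be checked with a short standalone sketch. The function names are hypothetical, the example assumes 8-bit chars, and the signed variant is shown only as one possible repair, not necessarily the one this commit applies.

```c
/* Store `level` in an unsigned char field and read it back as an int. */
static int stored_level_narrow(int level)
{
	unsigned char stored = (unsigned char)level;	/* -1 becomes 255 */

	return stored;
}

/* Same round trip through a signed char, which preserves -1. */
static int stored_level_signed(int level)
{
	signed char stored = (signed char)level;

	return stored;
}
```

Any later comparison of the narrowed field against -1 fails, because the value read back is 255.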
-
- 12 Nov, 2010 2 commits
-
-
Dave Jones authored
WARN_ONCE is a bit strong for a deprecation warning, given that it spews a huge backtrace. Signed-off-by:
Dave Jones <davej@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Sage Weil authored
We start at offset 2 for the leftmost frag, and 0 for subsequent frags. When we reach the end (rightmost), we go back to 2. This fixes readdir on fragmented (large) directories. Signed-off-by:
Sage Weil <sage@newdream.net>
-
- 11 Nov, 2010 1 commit
-
-
Sage Weil authored
Clear fi->last_name when it's freed. The only caller is rewinddir() (or equivalent lseek). Signed-off-by:
Sage Weil <sage@newdream.net>
-
- 10 Nov, 2010 13 commits
-
-
Christoph Hellwig authored
In commit 20cb52eb, titled "xfs: simplify xfs_vm_writepage", I added an assert that any !mapped and uptodate buffers are not dirty. That assert turns out to trigger a lot when running fsx on filesystems with small block sizes. The reason for that is that the assert is simply incorrect. !mapped and uptodate just mean this buffer covers a hole, and whenever we do a set_page_dirty we mark all blocks in the page dirty, no matter if they have data or not. So remove the assert, and update the comment above the condition to match reality. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
J. Bruce Fields authored
A minor oversight from f7347ce4, "fasync: re-organize fasync entry insertion to allow it under a spinlock": this cleanup-on-error was only needed to handle -ENOMEM. Now that we're preallocating it's unneeded. Signed-off-by:
J. Bruce Fields <bfields@redhat.com>
-
J. Bruce Fields authored
We must also free the passed-in lease in the case it wasn't used because an existing lease was upgraded/downgraded or already existed. Note the nfsd caller doesn't care because its fl_change callback returns an error in those cases. Signed-off-by:
J. Bruce Fields <bfields@redhat.com>
-
Christoph Hellwig authored
XFS does not need its inodes to actually be hashed in the VFS inode cache, but we require the inode to be marked hashed for the writeback code to work. Instead of using insert_inode_hash, which requires a second inode_lock roundtrip after the partial merge of the inode scalability patches in 2.6.37-rc, simply use the new hlist_add_fake helper to mark it hashed without requiring a lock or touching a global cache line. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
Christoph Hellwig authored
Andi Kleen reported that gcc-4.5 gives lots of warnings for him inside the XFS code. It turned out most of them are due to the quota stubs being macros, and gcc now complaining about macros evaluating to 0 that are not assigned to variables. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
Christoph Hellwig authored
The filestreams code may take the iolock on the parent inode while holding it on a child. This is the only place in XFS where we take both the child and parent iolock, so just telling lockdep about it is enough. The lock flag required for that was already added as part of the ilock lockdep annotations and unused so far. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
Dave Chinner authored
The delayed write buffer split trace currently issues a trace for every buffer it scans. These buffers are not necessarily queued for delayed write. Indeed, when buffers are pinned, there can be thousands of traces of buffers that aren't actually queued for delayed write and the ones that are are lost in the noise. Move the trace point to record only buffers that are split out for IO to be issued on. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
Dave Chinner authored
The walk fails to decrement the per-ag reference count when the non-blocking walk fails to obtain the per-ag reclaim lock, leading to an assert failure on debug kernels when unmounting a filesystem. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
Kulikov Vasiliy authored
al_hreq is copied from userland. If al_hreq.buflen is not properly aligned then xfs_attr_list will ignore the last bytes of kbuf. These bytes are uninitialized. This leads to leaking the contents of kernel stack memory. Signed-off-by:
Vasiliy Kulikov <segooon@gmail.com> Signed-off-by:
Alex Elder <aelder@sgi.com>
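The leak pattern in this commit (a filler that skips the unaligned tail of a buffer which is then copied out in full) can be sketched in userspace. Zeroing the allocation is shown as one common remedy and is an assumption, not a statement of what the patch does; `fill_records` and `build_reply` are hypothetical names, and the 8-byte record size is chosen only for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for a filler that only writes whole 8-byte records. */
static size_t fill_records(unsigned char *buf, size_t buflen)
{
	size_t usable = buflen & ~(size_t)7;	/* round down to a multiple of 8 */

	memset(buf, 0xAB, usable);
	return usable;
}

/*
 * Build a reply buffer of `buflen` bytes that is safe to copy out in
 * full: calloc() zeroes the allocation, so the unaligned tail that the
 * filler skips holds zeros rather than stale memory. The caller frees
 * the result.
 */
static unsigned char *build_reply(size_t buflen)
{
	unsigned char *buf = calloc(1, buflen);

	if (buf != NULL)
		fill_records(buf, buflen);
	return buf;
}
```

For a 13-byte request, the first 8 bytes come from the filler and the remaining 5 are guaranteed to be zero.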
-
Christoph Hellwig authored
We promised to do this for 2.6.37, and the code looks stable enough to keep that promise. Signed-off-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Signed-off-by:
Alex Elder <aelder@sgi.com>
-
Sergey Senozhatsky authored
Commit 4221a991 "Add RCU check for find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns. The assertion failed in sys_ioprio_get. This patch fixes the assertion failure in ioprio_set as well. kernel/pid.c:419 invoked rcu_dereference_check() without protection! stack backtrace: Pid: 4254, comm: iotop Not tainted Call Trace: [<ffffffff810656f2>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff81053c67>] find_task_by_pid_ns+0x4f/0x68 [<ffffffff81053c9d>] find_task_by_vpid+0x1d/0x1f [<ffffffff811104e2>] sys_ioprio_get+0x50/0x2da [<ffffffff81002182>] system_call_fastpath+0x16/0x1b V2: rcu critical section expanded according to comment by Paul E. McKenney Signed-off-by:
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by:
Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Daniel J Blueman authored
With 2.6.37-rc1, I observe sys_ioprio_set not taking the RCU lock [1] across access to the task credentials. Inspecting the code in fs/ioprio.c, the tasklist_lock is held for read across the __task_cred call, which is presumably sufficient to prevent the task credentials becoming stale. =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- kernel/pid.c:419 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 1 1 lock held by start-stop-daem/2246: #0: (tasklist_lock){.?.?..}, at: [<ffffffff811a2dfa>] sys_ioprio_set+0x8a/0x400 stack backtrace: Pid: 2246, comm: start-stop-daem Not tainted 2.6.37-rc1-330cd+ #2 Call Trace: [<ffffffff8109f5f4>] lockdep_rcu_dereference+0xa4/0xc0 [<ffffffff81085651>] find_task_by_pid_ns+0x81/0x90 [<ffffffff8108567d>] find_task_by_vpid+0x1d/0x20 [<ffffffff811a3160>] sys_ioprio_set+0x3f0/0x400 [<ffffffff816efa79>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81003482>] system_call_fastpath+0x16/0x1b Take the RCU lock for read across acquiring the pointer to the task credentials and dereferencing it. Signed-off-by:
Daniel J Blueman <daniel.blueman@gmail.com> Fixed up by Jens to fix missing rcu_read_unlock() on mismatches. Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
-
Jens Axboe authored
If the iovec is being set up in a way that causes uaddr + PAGE_SIZE to overflow, we could end up attempting to map a huge number of pages. Check for this invalid input type. Reported-by:
Dan Rosenberg <drosenberg@vsecurity.com> Cc: stable@kernel.org Signed-off-by:
Jens Axboe <jaxboe@fusionio.com>
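The kind of check this commit describes can be sketched as a wraparound test performed before counting pages. This is an illustration under assumptions: `span_pages` is a hypothetical helper, the 4 KiB page size is fixed for the example, and the code is not the block layer's implementation.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages for this example */

/*
 * Count the pages spanned by [uaddr, uaddr + len). Returns -1 when the
 * range wraps around the address space, instead of a huge bogus count.
 */
static long span_pages(uintptr_t uaddr, uintptr_t len)
{
	uintptr_t last, first_page, last_page;

	if (len == 0)
		return 0;

	last = uaddr + len - 1;		/* address of the final byte */
	if (last < uaddr)		/* unsigned wraparound detected */
		return -1;

	first_page = uaddr >> SKETCH_PAGE_SHIFT;
	last_page = last >> SKETCH_PAGE_SHIFT;
	return (long)(last_page - first_page + 1);
}
```

A range that starts near the top of the address space and extends past it is rejected up front, so no attempt is made to map an enormous number of pages.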
-