- 02 Mar, 2010 1 commit
-
-
Jens Axboe authored
This reverts commit 9f7cdbc3. It's causing oopses on dm setups, so revert it until we investigate. Reported-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com> Tested-by:
Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 28 Feb, 2010 1 commit
-
-
Dmitry Monakhov authored
merge_bvec_fn() returns bvec->bv_len on success, so we have to check against this value. But in the fs_optimization merge case we compare against the wrong value. This change should have been part of b428cd6da7e6559aca69aa2e3a526037d3f20403, but I accidentally forgot to add it to the initial patch. To set things straight, let's replace all such checks. In fact, this makes the code easier to understand. Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 26 Feb, 2010 1 commit
-
-
Martin K. Petersen authored
Except for SCSI no device drivers distinguish between physical and hardware segment limits. Consolidate the two into a single segment limit. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 28 Jan, 2010 1 commit
-
-
Dmitry Monakhov authored
We have to properly decrease bi_size in order for merge_bvec_fn to return the right result. Otherwise this results in false merge rejects for two absolutely valid bio_vecs. This may cause a significant performance penalty, for example when fs_block_size == 1k and the block device is raid0 with a small chunk_size = 8k. Then it is impossible to merge the 7th fs-block into a bio which already has 6 fs-blocks. Cc: <stable@kernel.org> Signed-off-by:
Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
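The false reject described above can be illustrated with a small user-space sketch. It is not the kernel code: `chunk_room()` and `can_grow_last_vec()` are hypothetical helpers that only model a raid0-style chunk boundary check, assuming the bio starts at the beginning of an 8k chunk and the new 1k block would extend the last bio_vec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a raid0-style merge callback: how many bytes
 * still fit in the chunk once bio_bytes are already queued. */
static size_t chunk_room(size_t chunk_bytes, size_t start_in_chunk,
                         size_t bio_bytes)
{
    size_t used = start_in_chunk + bio_bytes;
    return used >= chunk_bytes ? 0 : chunk_bytes - used;
}

/* Try to grow the last vector (prev_len bytes) by add_len bytes.
 * exclude_prev == true models the behaviour the commit message asks for:
 * the size passed to the callback no longer counts the vector being grown.
 * exclude_prev == false counts that vector twice. */
static bool can_grow_last_vec(size_t chunk_bytes, size_t bio_bytes,
                              size_t prev_len, size_t add_len,
                              bool exclude_prev)
{
    assert(prev_len <= bio_bytes);
    size_t grown = prev_len + add_len;
    size_t queued = exclude_prev ? bio_bytes - prev_len : bio_bytes;
    return chunk_room(chunk_bytes, 0, queued) >= grown;
}
```

With an 8192-byte chunk, 6144 bytes queued and a 2048-byte last vector, `can_grow_last_vec(8192, 6144, 2048, 1024, true)` accepts the 7th block, while the same call with `false` rejects it even though 7k fits comfortably in an 8k chunk.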
-
- 19 Jan, 2010 1 commit
-
-
Thiago Farina authored
fs/bio.c:81:33: warning: symbol 'bslab' shadows an earlier one
fs/bio.c:74:25: originally declared here
Signed-off-by:
Thiago Farina <tfransosi@gmail.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 04 Dec, 2009 1 commit
-
-
André Goddard Rosa authored
That is "success", "unknown", "through", "performance", "[re|un]mapping" , "access", "default", "reasonable", "[con]currently", "temperature" , "channel", "[un]used", "application", "example","hierarchy", "therefore" , "[over|under]flow", "contiguous", "threshold", "enough" and others. Signed-off-by:
André Goddard Rosa <andre.goddard@gmail.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- 26 Nov, 2009 1 commit
-
-
Ilya Loginov authored
The mtdblock driver doesn't call flush_dcache_page for pages in a request. This causes problems on architectures where the icache doesn't fill from the dcache, or with dcache aliases. The patch fixes this. The ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE symbol was introduced to avoid pointless empty cache-thrashing loops on architectures for which flush_dcache_page() is a no-op. Every architecture was provided with this symbol; pages are flushed on architectures where ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE is equal to 1, and nothing is done otherwise. See the "fix mtd_blkdevs problem with caches on some architectures" discussion on LKML for more information. Signed-off-by:
Ilya Loginov <isloginov@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Peter Horton <phorton@bitbox.co.uk> Cc: "Ed L. Cashin" <ecashin@coraid.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 02 Nov, 2009 2 commits
-
-
Alberto Bertogli authored
Commit 451a9ebf accidentally broke bio_alloc() and bio_kmalloc() comments by (almost) swapping them. This patch fixes that, by placing the comments in the right place. Signed-off-by:
Alberto Bertogli <albertito@blitiri.com.ar> Acked-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Alberto Bertogli authored
In bio_put()'s comment, add bio_clone() to the list of functions that can give you a bio reference. Signed-off-by:
Alberto Bertogli <albertito@blitiri.com.ar> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 01 Oct, 2009 1 commit
-
-
H Hartley Sweeten authored
As mentioned in Documentation/CodingStyle, move EXPORT* macros to the line immediately after the closing function brace line. Signed-off-by:
H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 10 Jul, 2009 1 commit
-
-
FUJITA Tomonori authored
I overlooked SG_DXFER_TO_FROM_DEV support when I converted sg to use the block layer mapping API (2.6.28). Douglas Gilbert explained SG_DXFER_TO_FROM_DEV: http://www.spinics.net/lists/linux-scsi/msg37135.html

=
The semantics of SG_DXFER_TO_FROM_DEV were:
  - copy user space buffer to kernel (LLD) buffer
  - do SCSI command which is assumed to be of the DATA_IN (data from device) variety. This would overwrite some or all of the kernel buffer
  - copy kernel (LLD) buffer back to the user space.
The idea was to detect short reads by filling the original user space buffer with some marker bytes ("0xec" it would seem in this report). The "resid" value is a better way of detecting short reads but that was only added this century and requires co-operation from the LLD.
=

This patch changes the block layer mapping API to support these semantics. It simply adds another field to struct rq_map_data and enables __bio_copy_iov() to copy data from user space even with READ requests. It would be better to add a flags field and kill the null_mapped and the new from_user fields in struct rq_map_data, but that approach makes it difficult to send this patch to stable trees because the st and osst drivers use struct rq_map_data (they were converted to use the block layer in 2.6.29 and 2.6.30). Well, I should clean up the block layer mapping API.

zhou sf reported this regression and tested this patch: http://www.spinics.net/lists/linux-scsi/msg37128.html http://www.spinics.net/lists/linux-scsi/msg37168.html Reported-by:
zhou sf <sxzzsf@gmail.com> Tested-by:
zhou sf <sxzzsf@gmail.com> Cc: stable@kernel.org Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
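The three-step SG_DXFER_TO_FROM_DEV flow quoted above can be mimicked in user space. The sketch below is only a simulation under stated assumptions: `toy_to_from_dev()`, the 64-byte `kbuf` and the 0x11 "device data" are invented for illustration, and no sg or block layer API is involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MARKER 0xec   /* marker byte mentioned in the quoted report */
#define TOY_KBUF   64     /* size of the simulated kernel (LLD) buffer */

/* Simulate a TO_FROM_DEV transfer in which the "device" delivers only
 * dev_len bytes. Returns the number of leading bytes that no longer look
 * like marker bytes, i.e. the apparent transfer length. */
static size_t toy_to_from_dev(unsigned char *user, size_t len, size_t dev_len)
{
    unsigned char kbuf[TOY_KBUF];
    size_t n = len;

    assert(len <= TOY_KBUF && dev_len <= len);
    memcpy(kbuf, user, len);      /* 1. user space buffer -> kernel buffer */
    memset(kbuf, 0x11, dev_len);  /* 2. DATA_IN command overwrites a prefix */
    memcpy(user, kbuf, len);      /* 3. kernel buffer -> user space */

    while (n > 0 && user[n - 1] == TOY_MARKER)
        n--;                      /* surviving markers reveal a short read */
    return n;
}
```

A caller would pre-fill its buffer with `TOY_MARKER` before the call. The heuristic undercounts if real data happens to end in the marker byte, which is consistent with the quoted remark that "resid" is the better mechanism.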
-
- 01 Jul, 2009 1 commit
-
-
Martin K. Petersen authored
This patch restores stacking ability to the block layer integrity infrastructure by creating a set of dedicated bip slabs. Each bip slab has an embedded bio_vec array at the end. This cuts down on memory allocations and also simplifies the code compared to the original bvec version. Only the largest bip slab is backed by a mempool. The pool is contained in the bio_set so stacking drivers can ensure forward progress. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <axboe@carl.(none)>
-
- 16 Jun, 2009 1 commit
-
-
Li Zefan authored
When porting blktrace to tracepoints, we changed to trace/block.h for trace prober declarations. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 12 Jun, 2009 1 commit
-
-
Nikanth Karthikesan authored
Fix typo in bio_alloc kernel doc. Signed-off-by:
Nikanth Karthikesan <knikanth@suse.de> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- 10 Jun, 2009 1 commit
-
-
Michal Simek authored
As reported by sparse:
  fs/bio.c:720:13: warning: incorrect type in assignment (different address spaces)
  fs/bio.c:720:13: expected char *iov_addr
  fs/bio.c:720:13: got void [noderef] <asn:1>*
  fs/bio.c:724:36: warning: incorrect type in argument 2 (different address spaces)
  fs/bio.c:724:36: expected void const [noderef] <asn:1>*from
  fs/bio.c:724:36: got char *iov_addr
Signed-off-by:
Michal Simek <monstr@monstr.eu> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 09 Jun, 2009 1 commit
-
-
Li Zefan authored
TRACE_EVENT is a more generic way to define tracepoints. Doing so adds these new capabilities to this tracepoint:

  - zero-copy and per-cpu splice() tracing
  - binary tracing without printf overhead
  - structured logging records exposed under /debug/tracing/events
  - trace events embedded in function tracer output and other plugins
  - user-defined, per tracepoint filter expressions
  ...

Cons:

  - no dev_t info for the output of plug, unplug_timer and unplug_io events.
    no dev_t info for getrq and sleeprq events if bio == NULL.
    no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.
    This is mainly because we can't get the device from a request queue. But this may change in the future.
  - A packet command is converted to a string in TP_assign, not TP_print, while blktrace does the conversion just before output. Since pc requests should be rather rare, this is not a big issue.
  - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT has a unique format, which means we have some unused data in a trace entry. The overhead is minimized by using __dynamic_array() instead of __array().

I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:

       dd                  dd + ioctl blktrace   dd + TRACE_EVENT (splice)
  1    7.36s, 42.7 MB/s    7.50s, 42.0 MB/s      7.41s, 42.5 MB/s
  2    7.43s, 42.3 MB/s    7.48s, 42.1 MB/s      7.43s, 42.4 MB/s
  3    7.38s, 42.6 MB/s    7.45s, 42.2 MB/s      7.41s, 42.5 MB/s

So the overhead of tracing is very small, and there is no regression when using those trace events vs blktrace.

And the binary output of TRACE_EVENT is much smaller than blktrace:

  # ls -l -h
  -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
  -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
  -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out

Following are some comparisons between TRACE_EVENT and blktrace:

plug:
  kjournald-480 [000] 303.084981: block_plug: [kjournald]
  kjournald-480 [000] 303.084981: 8,0 P N [kjournald]

unplug_io:
  kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
  kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1

remap:
  kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8 <- (8,8) 33384
  kjournald-480 [000] 303.085043: 8,0 A W 102736992 + 8 <- (8,8) 33384

bio_backmerge:
  kjournald-480 [000] 303.085086: block_bio_backmerge: 8,0 W 102737032 + 8 [kjournald]
  kjournald-480 [000] 303.085086: 8,0 M W 102737032 + 8 [kjournald]

getrq:
  kjournald-480 [000] 303.084974: block_getrq: 8,0 W 102736984 + 8 [kjournald]
  kjournald-480 [000] 303.084975: 8,0 G W 102736984 + 8 [kjournald]
  bash-2066 [001] 1072.953770: 8,0 G N [bash]
  bash-2066 [001] 1072.953773: block_getrq: 0,0 N 0 + 0 [bash]

rq_complete:
  konsole-2065 [001] 300.053184: block_rq_complete: 8,0 W () 103669040 + 16 [0]
  konsole-2065 [001] 300.053191: 8,0 C W 103669040 + 16 [0]
  ksoftirqd/1-7 [001] 1072.953811: 8,0 C N (5a 00 08 00 00 00 00 00 24 00) [0]
  ksoftirqd/1-7 [001] 1072.953813: block_rq_complete: 0,0 N (5a 00 08 00 00 00 00 00 24 00) 0 + 0 [0]

rq_insert:
  kjournald-480 [000] 303.084985: block_rq_insert: 8,0 W 0 () 102736984 + 8 [kjournald]
  kjournald-480 [000] 303.084986: 8,0 I W 102736984 + 8 [kjournald]

Changelog from v2 -> v3:
  - use the newly introduced __dynamic_array().

Changelog from v1 -> v2:
  - use __string() instead of __array() to minimize the memory required to store hex dump of rq->cmd().
  - support large pc requests.
  - add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.
  - some cleanups.

Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <4A2DF669.5070905@cn.fujitsu.com> Signed-off-by:
Steven Rostedt <rostedt@goodmis.org>
-
- 22 May, 2009 2 commits
-
-
Martin K. Petersen authored
Convert all external users of queue limits to using wrapper functions instead of poking the request queue variables directly. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Martin K. Petersen authored
Until now we have had a 1:1 mapping between storage device physical block size and the logical block size used when addressing the device. With SATA 4KB drives coming out that will no longer be the case. The sector size will be 4KB but the logical block size will remain 512 bytes. Hence we need to distinguish between the physical block size and the logical ditto. This patch renames hardsect_size to logical_block_size. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
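To make the distinction concrete, here is a minimal user-space sketch. `struct toy_limits` and both helpers are hypothetical; they only borrow the two size names from the message above and do not reproduce the kernel's queue limits structure.

```c
#include <assert.h>

/* Hypothetical pair of limits: the unit used to address the device and
 * the unit the media actually reads and writes. */
struct toy_limits {
    unsigned logical_block_size;   /* e.g. 512  */
    unsigned physical_block_size;  /* e.g. 4096 */
};

/* Number of logical blocks that share one physical block. */
static unsigned logical_per_physical(const struct toy_limits *l)
{
    assert(l->logical_block_size > 0);
    assert(l->physical_block_size % l->logical_block_size == 0);
    return l->physical_block_size / l->logical_block_size;
}

/* True if a logical block address starts on a physical block boundary. */
static int lba_is_aligned(const struct toy_limits *l, unsigned long long lba)
{
    return lba % logical_per_physical(l) == 0;
}
```

For a 512/4096 device, eight logical blocks map onto each physical block, so LBA 16 is aligned while LBA 3 is not.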
-
- 19 May, 2009 1 commit
-
-
Tejun Heo authored
When a read bio_copy_kern() request fails, the content of the bounce buffer is not copied back. However, as request failure doesn't necessarily mean complete failure, the buffer state can be useful. This behavior is also inconsistent with the user map counterpart, and the subtle difference between bounced and unbounced IO causes confusion. This patch makes bio_copy_kern_endio() ignore @err and always copy back data on request completion. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
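The behavioural change can be pictured with a tiny simulation. `toy_copy_kern_endio()` is a made-up function, not the kernel's handler; its `copy_on_error` flag exists only so both behaviours can be compared side by side.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical completion handler for a bounced READ.
 * copy_on_error == 0 models the old behaviour (skip the copy on failure),
 * copy_on_error == 1 models always copying back, as the message describes. */
static void toy_copy_kern_endio(char *dst, const char *bounce, size_t len,
                                int err, int copy_on_error)
{
    assert(dst != NULL && bounce != NULL);
    if (err && !copy_on_error)
        return;               /* partial data in the bounce buffer is lost */
    memcpy(dst, bounce, len); /* caller sees whatever the device delivered */
}
```

With the flag set, a request that fails part way through still leaves the partially filled data visible in the caller's buffer.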
-
- 28 Apr, 2009 1 commit
-
-
FUJITA Tomonori authored
st driver uses blk_rq_map_user() in order to just build a request out of page frames. In this case, map_data->offset is a non zero value and iov[0].iov_base is NULL. We need to increase nr_pages for that. Cc: stable@kernel.org Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 22 Apr, 2009 2 commits
-
-
Tejun Heo authored
Impact: remove possible deadlock condition There is no reason to use mempool backed allocation for map functions. Also, because kern mapping is used inside LLDs (e.g. for EH), using mempool backed allocation can lead to deadlock under extreme conditions (mempool already consumed by the time a request reached EH and requests are blocked on EH). Switch copy/map functions to bio_kmalloc(). Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
Impact: fix bio_kmalloc() and its destruction path

bio_kmalloc() was broken in two ways.
  * bvec_alloc_bs() first allocates bvec using kmalloc() and then ignores it and allocates again like non-kmalloc bvecs.
  * bio_kmalloc_destructor() didn't check for and free bio integrity data.

This patch fixes the above problems. The kmalloc path is separated out from bio_alloc_bioset() and allocates the requested number of bvecs as inline bvecs.
  * bio_alloc_bioset() no longer takes NULL @bs. None other than bio_kmalloc() used it and outside users can't know how it was allocated anyway.
  * Define and use BIO_POOL_NONE so that pool index check in bvec_free_bs() triggers if inline or kmalloc allocated bvec gets there.
  * Relocate destructors on top of each allocation function so that how they're used is more clear.

Jens Axboe suggested allocating bvecs inline. Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 15 Apr, 2009 1 commit
-
-
Jens Axboe authored
Explain that with __GFP_WAIT set it will not fail, and that the caller must never allocate more than 1 bio at a time. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 30 Mar, 2009 1 commit
-
-
Alberto Bertogli authored
Signed-off-by:
Alberto Bertogli <albertito@blitiri.com.ar> Signed-off-by:
Jiri Kosina <jkosina@suse.cz>
-
- 24 Mar, 2009 3 commits
-
-
Martin K. Petersen authored
The integrity bio allocation needs its own bio_set to avoid violating the mempool allocation rules and risking deadlocks. Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
If we don't have CONFIG_BLK_DEV_INTEGRITY set, then we don't have any external dependencies on the bio_vec slabs. So don't create the ones that we will inline anyway. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Ingo Molnar authored
this warning (which got fixed by commit b2bf9683):

  fs/bio.c: In function ‘bio_alloc_bioset’:
  fs/bio.c:305: warning: ‘p’ may be used uninitialized in this function

Triggered because the code flow in bio_alloc_bioset() is correct but a bit complex for the compiler to see through. Streamline it a bit - this also makes the code a tiny bit more compact:

     text    data     bss     dec     hex filename
     7540     256      40    7836    1e9c bio.o.before
     7539     256      40    7835    1e9b bio.o.after

Also remove an older compiler-warnings annotation from this function, it's not needed. Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 14 Mar, 2009 2 commits
-
-
Li Zefan authored
If bio_integrity_clone() fails, bio_clone() returns NULL without freeing the newly allocated bio. Signed-off-by:
Li Zefan <lizf@cn.fujitsu.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
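The leak follows a common error-path pattern, sketched here in user space. Every name (`leak_bio`, `toy_bio_clone()`, `toy_integrity_clone()`, the `live_bios` counter) is hypothetical and stands in for the kernel functions named in the message.

```c
#include <assert.h>
#include <stdlib.h>

struct leak_bio { int has_integrity; };

static int live_bios;  /* counts allocations that have not been released */

static struct leak_bio *leak_bio_alloc(void)
{
    struct leak_bio *b = calloc(1, sizeof(*b));
    if (b)
        live_bios++;
    return b;
}

static void leak_bio_put(struct leak_bio *b)
{
    assert(b != NULL);
    live_bios--;
    free(b);
}

/* Stand-in for an integrity clone step that may fail. */
static int toy_integrity_clone(struct leak_bio *dst, int fail)
{
    if (fail)
        return -1;
    dst->has_integrity = 1;
    return 0;
}

static struct leak_bio *toy_bio_clone(int integrity_fails)
{
    struct leak_bio *b = leak_bio_alloc();
    if (!b)
        return NULL;
    if (toy_integrity_clone(b, integrity_fails) < 0) {
        leak_bio_put(b);  /* release the new bio before reporting failure */
        return NULL;
    }
    return b;
}
```

Without the `leak_bio_put()` call on the failure branch, `live_bios` would stay at 1 after a failed clone, which is the kind of leak the message describes.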
-
Jun'ichi Nomura authored
Stricter gfp_mask might be required for clone allocation. For example, request-based dm may clone bio in interrupt context so it has to use GFP_ATOMIC. Signed-off-by:
Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by:
Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Acked-by:
Martin K. Petersen <martin.petersen@oracle.com> Cc: Alasdair G Kergon <agk@redhat.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 26 Feb, 2009 1 commit
-
-
Jens Axboe authored
Newer gcc throws this warning: fs/bio.c: In function ‘bio_alloc_bioset’: fs/bio.c:305: warning: ‘p’ may be used uninitialized in this function since it cannot figure out that 'p' is only ever used if 'bs' is non-NULL. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 18 Feb, 2009 1 commit
-
-
Subhash Peddamallu authored
When freeing from the bio pool, use the right pointer to account for bs->front_pad, instead of the bio pointer. Signed-off-by:
Subhash Peddamallu <subhash.peddamallu@gmail.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
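Why the bio pointer is the wrong thing to free can be shown with plain malloc. This is a sketch under assumptions: `pad_bio` and its helpers are invented, and `front_pad` is passed explicitly rather than read from a bio_set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct pad_bio { int vcnt; };

/* Allocate front_pad bytes of caller-private space followed by the bio,
 * and hand back a pointer to the bio part only. */
static struct pad_bio *pad_bio_alloc(size_t front_pad)
{
    unsigned char *p;

    /* keep the bio suitably aligned in this toy example */
    assert(front_pad % sizeof(void *) == 0);
    p = calloc(1, front_pad + sizeof(struct pad_bio));
    return p ? (struct pad_bio *)(p + front_pad) : NULL;
}

/* Start of the underlying allocation: the only pointer that may be freed. */
static void *pad_bio_base(struct pad_bio *bio, size_t front_pad)
{
    return (unsigned char *)bio - front_pad;
}

static void pad_bio_free(struct pad_bio *bio, size_t front_pad)
{
    free(pad_bio_base(bio, front_pad));  /* not free(bio) */
}
```

Whenever `front_pad` is non-zero, the bio pointer sits inside the allocation rather than at its start, so the free path has to step back by the same amount the allocation path stepped forward.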
-
- 02 Jan, 2009 3 commits
-
-
FUJITA Tomonori authored
The commit 81882766 (block: make blk_rq_map_user take a NULL user-space buffer) extended blk_rq_map_user to accept a NULL user-space buffer with a READ command. It was necessary to convert sg to use the block layer mapping API. This patch extends blk_rq_map_user again for a WRITE command. It is necessary to convert st and osst drivers to use the block layer mapping API. Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by:
Jens Axboe <jens.axboe@oracle.com> Signed-off-by:
James Bottomley <James.Bottomley@HansenPartnership.com>
-
FUJITA Tomonori authored
This fixes bio_copy_user_iov to properly handle the partial mappings with struct rq_map_data (which only sg uses for now but st and osst will shortly). It adds the offset member to struct rq_map_data and changes blk_rq_map_user to update it so that bio_copy_user_iov can add an appropriate page frame via bio_add_pc_page(). Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by:
Jens Axboe <jens.axboe@oracle.com> Signed-off-by:
James Bottomley <James.Bottomley@HansenPartnership.com>
-
FUJITA Tomonori authored
This fixes bio_add_page misuse in bio_copy_user_iov with rq_map_data, which only sg uses now. rq_map_data carries page frames for bio_add_pc_page. bio_copy_user_iov uses bio_add_pc_page with a larger size than PAGE_SIZE. It's clearly wrong. Signed-off-by:
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Acked-by:
Jens Axboe <jens.axboe@oracle.com> Signed-off-by:
James Bottomley <James.Bottomley@HansenPartnership.com>
-
- 29 Dec, 2008 5 commits
-
-
Jens Axboe authored
We don't need to clear the memory used for adding bio_vec entries, since nobody should be looking at uninitialized members. Any valid use should be below bio->bi_vcnt, and members up to that count must be valid since they were added through bio_add_page(). Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
When we go and allocate a bio for IO, we actually do two allocations. One for the bio itself, and one for the bi_io_vec that holds the actual pages we are interested in. This feature inlines a definable amount of io vecs inside the bio itself, so we eliminate the bio_vec array allocation for IO's up to a certain size. It defaults to 4 vecs, which is typically 16k of IO. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
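The idea of inlining a few vectors can be sketched as follows. The layout is an assumption made for illustration (`inl_bio`, `INLINE_VECS` and a fixed-size embedded array); the kernel's actual structure may differ.

```c
#include <assert.h>
#include <stdlib.h>

#define INLINE_VECS   4      /* default mentioned in the message */
#define TOY_PAGE_SIZE 4096   /* assumed page size: 4 vecs * 4k = 16k */

struct toy_vec { void *page; unsigned len, offset; };

struct inl_bio {
    unsigned max_vecs;
    struct toy_vec *io_vec;                   /* inline array or separate one */
    struct toy_vec inline_vecs[INLINE_VECS];
};

static struct inl_bio *inl_bio_alloc(unsigned nr_vecs)
{
    struct inl_bio *bio;

    assert(nr_vecs > 0);
    bio = calloc(1, sizeof(*bio));
    if (!bio)
        return NULL;
    if (nr_vecs <= INLINE_VECS) {
        bio->io_vec = bio->inline_vecs;       /* one allocation in total */
        bio->max_vecs = INLINE_VECS;
    } else {
        bio->io_vec = calloc(nr_vecs, sizeof(*bio->io_vec));
        if (!bio->io_vec) {
            free(bio);
            return NULL;
        }
        bio->max_vecs = nr_vecs;              /* two allocations */
    }
    return bio;
}

static void inl_bio_free(struct inl_bio *bio)
{
    if (bio->io_vec != bio->inline_vecs)
        free(bio->io_vec);
    free(bio);
}
```

Small requests take the single-allocation branch; only larger ones pay for the separate vector array.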
-
Jens Axboe authored
Instead of having a global bio slab cache, add a reference to one in each bio_set that is created. This allows for personalized slabs in each bio_set, so that they can have bios of different sizes. This means we can personalize the bios we return. File systems may want to embed the bio inside another structure, to avoid allocating more items (and stuffing them in ->bi_private) after they get a bio. Or we may want to embed a number of bio_vecs directly at the end of a bio, to avoid doing two allocations to return a bio. This is now possible. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
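Embedding a bio inside a larger structure might look like the sketch below. `struct fs_io`, `emb_bio` and the helpers are hypothetical; the point is only that one allocation carries both the caller's data and the bio, and that the outer structure can be recovered from the bio pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct emb_bio { int vcnt; };

/* Hypothetical per-IO structure a file system might use. */
struct fs_io {
    int inode_no;          /* private data that would otherwise need its
                              own allocation and a ->bi_private pointer */
    struct emb_bio bio;    /* embedded bio */
};

/* One allocation for both; the caller only sees the bio. */
static struct emb_bio *fs_io_alloc(int inode_no)
{
    struct fs_io *io = calloc(1, sizeof(*io));
    if (!io)
        return NULL;
    io->inode_no = inode_no;
    return &io->bio;
}

/* Recover the outer structure from the embedded bio (container_of style). */
static struct fs_io *fs_io_from_bio(struct emb_bio *bio)
{
    assert(bio != NULL);
    return (struct fs_io *)((char *)bio - offsetof(struct fs_io, bio));
}

static void fs_io_free(struct emb_bio *bio)
{
    free(fs_io_from_bio(bio));
}
```

A completion handler that receives only the bio can call `fs_io_from_bio()` to reach its private state without a second allocation.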
-
Jens Axboe authored
In preparation for adding differently sized bios. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Jens Axboe authored
We only very rarely need the mempool backing, so it makes sense to get rid of all but one of the mempools in a bio_set. So keep the largest bio_vec count mempool so we can always honor the largest allocation, and "upgrade" callers that fail. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 26 Nov, 2008 1 commit
-
-
Ingo Molnar authored
Port to the new tracepoints API: split DEFINE_TRACE() and DECLARE_TRACE() sites. Spread them out to the usage sites, as suggested by Mathieu Desnoyers. Signed-off-by:
Ingo Molnar <mingo@elte.hu> Acked-by:
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
-