block: make blkdev_get/put() handle exclusive access
Tejun Heo authored

Over time, block layer has accumulated a set of APIs dealing with bdev
open, close, claim and release.

* blkdev_get/put() are the primary open and close functions.

* bd_claim/release() deal with exclusive open.

* open/close_bdev_exclusive() are combination of open and claim and
  the other way around, respectively.

* bd_link/unlink_disk_holder() to create and remove holder/slave
  symlinks.

* open_by_devnum() wraps bdget() + blkdev_get().

The interface is a bit confusing and the decoupling of open and claim
makes it impossible to properly guarantee exclusive access as
in-kernel open + claim sequence can disturb the existing exclusive
open even before the block layer knows the current open if for another
exclusive access.  Reorganize the interface such that,

* blkdev_get() is extended to include exclusive access management.
  @holder argument is added and, if is @FMODE_EXCL specified, it will
  gain exclusive access atomically w.r.t. other exclusive accesses.

* blkdev_put() is similarly extended.  It now takes @mode argument and
  if @FMODE_EXCL is set, it releases an exclusive access.  Also, when
  the last exclusive claim is released, the holder/slave symlinks are
  removed automatically.

* bd_claim/release() and close_bdev_exclusive() are no longer
  necessary and either made static or removed.

* bd_link_disk_holder() remains the same but bd_unlink_disk_holder()
  is no longer necessary and removed.

* open_bdev_exclusive() becomes a simple wrapper around lookup_bdev()
  and blkdev_get().  It also has an unexpected extra bdev_read_only()
  test which probably should be moved into blkdev_get().

* open_by_devnum() is modified to take @holder argument and pass it to
  blkdev_get().

Most of bdev open/close operations are unified into blkdev_get/put()
and most exclusive accesses are tested atomically at the open time (as
it should).  This cleans up code and removes some, both valid and
invalid, but unnecessary all the same, corner cases.

open_bdev_exclusive() and open_by_devnum() can use further cleanup -
rename to blkdev_get_by_path() and blkdev_get_by_devt() and drop
special features.  Well, let's leave them for another day.

Most conversions are straight-forward.  drbd conversion is a bit more
involved as there was some reordering, but the logic should stay the
same.
Signed-off-by: default avatarTejun Heo <tj@kernel.org>
Acked-by: default avatarNeil Brown <neilb@suse.de>
Acked-by: default avatarRyusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: default avatarMike Snitzer <snitzer@redhat.com>
Acked-by: default avatarPhilipp Reisner <philipp.reisner@linbit.com>
Cc: Peter Osterlund <petero2@telia.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <joel.becker@oracle.com>
Cc: Alex Elder <aelder@sgi.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: dm-devel@redhat.com
Cc: drbd-dev@lists.linbit.com
Cc: Leo Chen <leochen@broadcom.com>
Cc: Scott Branden <sbranden@broadcom.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Cc: Joern Engel <joern@logfs.org>
Cc: reiserfs-devel@vger.kernel.org
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
e525fd89
Name Last commit Last update
..
Kconfig Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Kconfig.debug trivial: improve help text for mm debug config options
Makefile percpu: use percpu allocator on UP too
backing-dev.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
bootmem.c x86, memblock: Replace e820_/_early string with memblock_
bounce.c bounce: call flush_dcache_page() after bounce_copy_vec()
compaction.c mm: compaction: handle active and inactive fairly in too_many_isolated
debug-pagealloc.c generic debug pagealloc
dmapool.c mm: add a might_sleep_if() to dma_pool_alloc()
fadvise.c readahead: introduce FMODE_RANDOM for POSIX_FADV_RANDOM
failslab.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
filemap.c Release page reference during page fault retry
filemap_xip.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
fremap.c Avoid pgoff overflow in remap_file_pages
highmem.c mm,x86: fix kmap_atomic_push vs ioremap_32.c
hugetlb.c mm/hugetlb.c: add missing spin_lock() to hugetlb_cow()
hwpoison-inject.c HWPOISON, hugetlb: support hwpoison injection for hugepage
init-mm.c mm: provide init_mm mm_context initializer
internal.h mm: fix is_mem_section_removable() page_order BUG_ON check
kmemcheck.c kmemcheck: add hooks for the page allocator
kmemleak-test.c percpu: clean up percpu variable definitions
kmemleak.c kmemleak: Fix typo in the comment
ksm.c ksm: fix bad user data when swapping
maccess.c MN10300: Save frame pointer in thread_info struct rather than global var
madvise.c HWPOISON: Add a madvise() injector for soft page offlining
memblock.c
memcontrol.c
memory-failure.c
memory.c
memory_hotplug.c
mempolicy.c
mempool.c
migrate.c
mincore.c
mlock.c
mm_init.c
mmap.c
mmu_context.c
mmu_notifier.c
mmzone.c
mprotect.c
mremap.c
msync.c
nommu.c
oom_kill.c
page-writeback.c
page_alloc.c
page_cgroup.c
page_io.c
page_isolation.c
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
prio_tree.c
quicklist.c
readahead.c
rmap.c
shmem.c
slab.c
slob.c
slub.c
sparse-vmemmap.c
sparse.c
swap.c
swap_state.c
swapfile.c
thrash.c
truncate.c
util.c
vmalloc.c
vmscan.c
vmstat.c