1. 15 Dec, 2009 2 commits
    • KOSAKI Motohiro's avatar
      vmscan: stop kswapd waiting on congestion when the min watermark is not being met · bb3ab596
      KOSAKI Motohiro authored
      
      If reclaim fails to make sufficient progress, the priority is raised.
      Once the priority is higher, kswapd starts waiting on congestion.
      However, if the zone is below the min watermark then kswapd needs to
      continue working without delay as there is a danger of an increased rate
      of GFP_ATOMIC allocation failure.
      
      This patch changes the conditions under which kswapd waits on congestion
      by only going to sleep if the min watermarks are being met.
      
      [mel@csn.ul.ie: add stats to track how relevant the logic is]
      [mel@csn.ul.ie: make kswapd only check its own zones and rename the relevant counters]
      Signed-off-by: default avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: default avatarMel Gorman <mel@csn.ul.ie>
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      bb3ab596
    • Mel Gorman's avatar
      vmscan: have kswapd sleep for a short interval and double check it should be asleep · f50de2d3
      Mel Gorman authored
      
      After kswapd balances all zones in a pgdat, it goes to sleep.  In the
      event of no IO congestion, kswapd can go to sleep very shortly after the
      high watermark was reached.  If there are a constant stream of allocations
      from parallel processes, it can mean that kswapd went to sleep too quickly
      and the high watermark is not being maintained for sufficient length time.
      
      This patch makes kswapd go to sleep as a two-stage process.  It first
      tries to sleep for HZ/10.  If it is woken up by another process or the
      high watermark is no longer met, it's considered a premature sleep and
      kswapd continues work.  Otherwise it goes fully to sleep.
      
      This adds more counters to distinguish between fast and slow breaches of
      watermarks.  A "fast" premature sleep is one where the low watermark was
      hit in a very short time after kswapd going to sleep.  A "slow" premature
      sleep indicates that the high watermark was breached after a very short
      interval.
      Signed-off-by: default avatarMel Gorman <mel@csn.ul.ie>
      Cc: Frans Pop <elendil@planet.nl>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      f50de2d3
  2. 29 Oct, 2009 1 commit
    • Tejun Heo's avatar
      percpu: make percpu symbols under kernel/ and mm/ unique · 1871e52c
      Tejun Heo authored
      
      This patch updates percpu related symbols under kernel/ and mm/ such
      that percpu symbols are unique and don't clash with local symbols.
      This serves two purposes of decreasing the possibility of global
      percpu symbol collision and allowing dropping per_cpu__ prefix from
      percpu symbols.
      
      * kernel/lockdep.c: s/lock_stats/cpu_lock_stats/
      
      * kernel/sched.c: s/init_rq_rt/init_rt_rq_var/	(any better idea?)
        		  s/sched_group_cpus/sched_groups/
      
      * kernel/softirq.c: s/ksoftirqd/run_ksoftirqd/a
      
      * kernel/softlockup.c: s/(*)_timestamp/softlockup_\1_ts/
        		       s/watchdog_task/softlockup_watchdog/
      		       s/timestamp/ts/ for local variables
      
      * kernel/time/timer_stats: s/lookup_lock/tstats_lookup_lock/
      
      * mm/slab.c: s/reap_work/slab_reap_work/
        	     s/reap_node/slab_reap_node/
      
      * mm/vmstat.c: local variable changed to avoid collision with vmstat_work
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatar(slab/vmstat) Christoph Lameter <cl@linux-foundation.org>
      Reviewed-by: default avatarChristoph Lameter <cl@linux-foundation.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Nick Piggin <npiggin@suse.de>
      1871e52c
  3. 22 Sep, 2009 3 commits
  4. 17 Jun, 2009 5 commits
  5. 18 May, 2009 1 commit
    • Mel Gorman's avatar
      [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2 · eb33575c
      Mel Gorman authored
      
      pfn_valid() is meant to be able to tell if a given PFN has valid memmap
      associated with it or not. In FLATMEM, it is expected that holes always
      have valid memmap as long as there is valid PFNs either side of the hole.
      In SPARSEMEM, it is assumed that a valid section has a memmap for the
      entire section.
      
      However, ARM and maybe other embedded architectures in the future free
      memmap backing holes to save memory on the assumption the memmap is never
      used. The page_zone linkages are then broken even though pfn_valid()
      returns true. A walker of the full memmap must then do this additional
      check to ensure the memmap they are looking at is sane by making sure the
      zone and PFN linkages are still valid. This is expensive, but walkers of
      the full memmap are extremely rare.
      
      This was caught before for FLATMEM and hacked around but it hits again for
      SPARSEMEM because the page_zone linkages can look ok where the PFN linkages
      are totally screwed. This looks like a hatchet job but the reality is that
      any clean solution would end up consumning all the memory saved by punching
      these unexpected holes in the memmap. For example, we tried marking the
      memmap within the section invalid but the section size exceeds the size of
      the hole in most cases so pfn_valid() starts returning false where valid
      memmap exists. Shrinking the size of the section would increase memory
      consumption offsetting the gains.
      
      This patch identifies when an architecture is punching unexpected holes
      in the memmap that the memory model cannot automatically detect and sets
      ARCH_HAS_HOLES_MEMORYMODEL. At the moment, this is restricted to EP93xx
      which is the model sub-architecture this has been reported on but may expand
      later. When set, walkers of the full memmap must call memmap_valid_within()
      for each PFN and passing in what it expects the page and zone to be for
      that PFN. If it finds the linkages to be broken, it assumes the memmap is
      invalid for that PFN.
      Signed-off-by: default avatarMel Gorman <mel@csn.ul.ie>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      eb33575c
  6. 03 Apr, 2009 1 commit
  7. 01 Apr, 2009 1 commit
  8. 30 Mar, 2009 1 commit
  9. 31 Dec, 2008 1 commit
  10. 23 Oct, 2008 4 commits
  11. 20 Oct, 2008 7 commits
  12. 27 Aug, 2008 1 commit
    • Mel Gorman's avatar
      [ARM] Skip memory holes in FLATMEM when reading /proc/pagetypeinfo · e80d6a24
      Mel Gorman authored
      
      Ordinarily, memory holes in flatmem still have a valid memmap and is safe
      to use. However, an architecture (ARM) frees up the memmap backing memory
      holes on the assumption it is never used. /proc/pagetypeinfo reads the
      whole range of pages in a zone believing that the memmap is valid and that
      pfn_valid will return false if it is not. On ARM, freeing the memmap breaks
      the page->zone linkages even though pfn_valid() returns true and the kernel
      can oops shortly afterwards due to accessing a bogus struct zone *.
      
      This patch lets architectures say when FLATMEM can have holes in the
      memmap. Rather than an expensive check for valid memory, /proc/pagetypeinfo
      will confirm that the page linkages are still valid by checking page->zone
      is still the expected zone. The lookup of page_zone is safe as there is a
      limited range of memory that is accessed when calling page_zone.  Even if
      page_zone happens to return the correct zone, the impact is that the counters
      in /proc/pagetypeinfo are slightly off but fragmentation monitoring is
      unlikely to be relevant on an embedded system.
      Reported-by: default avatarH Hartley Sweeten <hsweeten@visionengravers.com>
      Signed-off-by: default avatarMel Gorman <mel@csn.ul.ie>
      Tested-by: default avatarH Hartley Sweeten <hsweeten@visionengravers.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      e80d6a24
  13. 24 Jul, 2008 1 commit
  14. 23 May, 2008 1 commit
  15. 13 May, 2008 1 commit
  16. 30 Apr, 2008 2 commits
  17. 28 Apr, 2008 3 commits
  18. 16 Apr, 2008 1 commit
    • KOSAKI Motohiro's avatar
      add "Isolate" migratetype name to /proc/pagetypeinfo · 91446b06
      KOSAKI Motohiro authored
      In a5d76b54
      
       (memory unplug: page isolation by
      KAMEZAWA Hiroyuki), "isolate" migratetype added.  but unfortunately, it
      doesn't treat /proc/pagetypeinfo display logic.
      
      this patch add "Isolate" to pagetype name field.
      
      /proc/pagetype
      before:
      ------------------------------------------------------------------------------------------------------------------------
      Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
      Node    0, zone      DMA, type    Unmovable      1      2      2      2      1      2      2      1      1      0      0
      Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
      Node    0, zone      DMA, type      Movable      2      3      3      1      3      3      2      0      0      0      0
      Node    0, zone      DMA, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
      Node    0, zone      DMA, type       <NULL>      0      0      0      0      0      0      0      0      0      0      0
      Node    0, zone   Normal, type    Unmovable      1      9      7      4      1      1      1      1      0      0      0
      Node    0, zone   Normal, type  Reclaimable      5      2      0      0      1      1      0      0      0      1      0
      Node    0, zone   Normal, type      Movable      0      1      1      0      0      0      1      0      0      1     60
      Node    0, zone   Normal, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
      Node    0, zone   Normal, type       <NULL>      0      0      0      0      0      0      0      0      0      0      0
      Node    0, zone  HighMem, type    Unmovable      0      0      1      1      1      0      1      1      2      2      0
      Node    0, zone  HighMem, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
      Node    0, zone  HighMem, type      Movable    236     62      6      2      2      1      1      0      1      1     16
      Node    0, zone  HighMem, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
      Node    0, zone  HighMem, type       <NULL>      0      0      0      0      0      0      0      0      0      0      0
      
      Number of blocks type     Unmovable  Reclaimable      Movable      Reserve       <NULL>
      Node 0, zone      DMA            1            0            2       1            0
      Node 0, zone   Normal           10           40          169       1            0
      Node 0, zone  HighMem            2            0          283       1            0
      
      after:
      ------------------------------------------------------------------------------------------------------------------------
      Free pages count per migrate type at order       0      1      2      3      4      5      6      7      8      9     10
      Node    0, zone      DMA, type    Unmovable      1      2      2      2      1      2      2      1      1      0      0
      Node    0, zone      DMA, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
      Node    0, zone      DMA, type      Movable      2      3      3      1      3      3      2      0      0      0      0
      Node    0, zone      DMA, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
      Node    0, zone      DMA, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
      Node    0, zone   Normal, type    Unmovable      0      2      1      1      0      1      0      0      0      0      0
      Node    0, zone   Normal, type  Reclaimable      1      1      1      1      1      0      1      1      1      0      0
      Node    0, zone   Normal, type      Movable      0      1      1      1      0      1      0      1      0      0    196
      Node    0, zone   Normal, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
      Node    0, zone   Normal, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
      Node    0, zone  HighMem, type    Unmovable      0      1      0      0      0      1      1      1      2      2      0
      Node    0, zone  HighMem, type  Reclaimable      0      0      0      0      0      0      0      0      0      0      0
      Node    0, zone  HighMem, type      Movable      1      0      1      1      0      0      0      0      1      0    200
      Node    0, zone  HighMem, type      Reserve      0      0      0      0      0      0      0      0      0      0      1
      Node    0, zone  HighMem, type      Isolate      0      0      0      0      0      0      0      0      0      0      0
      
      Number of blocks type     Unmovable  Reclaimable      Movable      Reserve      Isolate
      Node 0, zone      DMA            1            0            2       1            0
      Node 0, zone   Normal            8            4          207       1            0
      Node 0, zone  HighMem            2            0          283       1            0
      Signed-off-by: default avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: default avatarMel Gorman <mel@csn.ul.ie>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      91446b06
  19. 05 Feb, 2008 3 commits