1. 17 Sep, 2009 1 commit
    • sched: Add new wakeup preemption mode: WAKEUP_RUNNING · ad4b78bb
      Peter Zijlstra authored
      
      Create a new wakeup preemption mode that preempts in favor of tasks that
      run shorter on average. It sets the next buddy to make sure we actually
      run the task we preempted for.
      
      Test results:
      
       root@twins:~# while :; do :; done &
       [1] 6537
       root@twins:~# while :; do :; done &
       [2] 6538
       root@twins:~# while :; do :; done &
       [3] 6539
       root@twins:~# while :; do :; done &
       [4] 6540
      
       root@twins:/home/peter# ./latt -c4 sleep 4
       Entries: 48 (clients=4)
      
       Averages:
       ------------------------------
              Max          4750 usec
              Avg           497 usec
              Stdev         737 usec
      
       root@twins:/home/peter# echo WAKEUP_RUNNING > /debug/sched_features
      
       root@twins:/home/peter# ./latt -c4 sleep 4
       Entries: 48 (clients=4)
      
       Averages:
       ------------------------------
              Max            14 usec
              Avg             5 usec
              Stdev           3 usec
      
      Disabled by default - needs more testing.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      LKML-Reference: <new-submission>
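The WAKEUP_RUNNING rule described in this commit message can be sketched roughly as follows. This is a toy model, not the kernel code: the `Task` and `RunQueue` classes, the `avg_running` field, and the function name are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str
    avg_running: float  # hypothetical: average length of a run, in usec


@dataclass
class RunQueue:
    curr: Task
    next_buddy: Optional[Task] = None
    need_resched: bool = False


def check_preempt_wakeup_running(rq: RunQueue, woken: Task) -> bool:
    """Preempt the current task if the woken task runs shorter on average."""
    if woken.avg_running < rq.curr.avg_running:
        rq.next_buddy = woken   # make sure the woken task is the one picked next
        rq.need_resched = True  # request that the current task be descheduled
        return True
    return False
```

With a CPU hog as the current task and a short-running task waking up, the sketch preempts and marks the woken task as next buddy; in the opposite situation it leaves the current task alone.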
  2. 16 Sep, 2009 2 commits
    • sched: Optimize cgroup vs wakeup a bit · 3b640894
      Peter Zijlstra authored
      
      We don't need to call update_shares() for each domain we iterate,
      just for the largest one.
      
      However, we should call it before wake_affine() as well, so that
      it can use up-to-date values too.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • sched: Implement a gentler fair-sleepers feature · 51e0304c
      Ingo Molnar authored
      
      Add back FAIR_SLEEPERS and GENTLE_FAIR_SLEEPERS.
      
      FAIR_SLEEPERS is the old logic: credit sleepers with their sleep time.
      
      GENTLE_FAIR_SLEEPERS dampens this a bit: 50% of their sleep time gets
      credited.
      
      The hope here is to still give the benefits of fair-sleepers logic
      (quick wakeups, etc.) while not crediting them with 100% of their
      sleep time as if they had been running.
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
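The difference between the two feature flags can be illustrated with a small placement model. This is a sketch under assumptions: the function name, its parameters, and the idea of capping the credit with `credit_cap` are hypothetical and only follow the wording of the commit message (full credit versus 50% credit); the real scheduler code may bound the credit differently.

```python
def place_woken_task(task_vruntime, min_vruntime, slept, credit_cap,
                     fair_sleepers=True, gentle=True):
    """Return the virtual runtime a task gets when it wakes up.

    A smaller vruntime means the task is scheduled sooner.
    """
    placed = min_vruntime
    if fair_sleepers:
        credit = min(slept, credit_cap)  # assumed: the credit is bounded
        if gentle:
            credit //= 2                 # GENTLE_FAIR_SLEEPERS: only 50% credited
        placed -= credit
    # A task never gains time by being placed behind its own vruntime.
    return max(task_vruntime, placed)
```

For a queue whose minimum vruntime is 1000 and a sleeper entitled to a credit of 20, the sketch places the task at 980 with full credit, at 990 with the gentler variant, and at 1000 with sleeper fairness off.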
  3. 15 Sep, 2009 6 commits
  4. 10 Sep, 2009 1 commit
    • sched: Disable NEW_FAIR_SLEEPERS for now · 3f2aa307
      Ingo Molnar authored
      
      Nikos Chantziaras and Jens Axboe reported that turning off
      NEW_FAIR_SLEEPERS improves desktop interactivity visibly.
      
      Nikos described his experiences the following way:
      
        " With this setting, I can do "nice -n 19 make -j20" and
          still have a very smooth desktop and watch a movie at
          the same time.  Various other annoyances (like the
          "logout/shutdown/restart" dialog of KDE not appearing
          at all until the background fade-out effect has finished)
          are also gone.  So this seems to be the single most
          important setting that vastly improves desktop behavior,
          at least here. "
      
      Jens described it the following way, referring to a 10-second
      xmodmap scheduling delay he was trying to debug:
      
        " Then I tried switching NO_NEW_FAIR_SLEEPERS on, and then
          I get:
      
          Performance counter stats for 'xmodmap .xmodmap-carl':
      
               9.009137  task-clock-msecs         #      0.447 CPUs
                     18  context-switches         #      0.002 M/sec
                      1  CPU-migrations           #      0.000 M/sec
                    315  page-faults              #      0.035 M/sec
      
          0.020167093  seconds time elapsed
      
          Woot! "
      
      So disable it for now. In perf trace output I can see weird
      delta timestamps:
      
        cc1-9943  [001]  2802.059479616: sched_stat_wait: task: as:9944 wait: 2801938766276 [ns]
      
      That nsec field is not supposed to be that large. More digging
      is needed, but let's turn it off while the real bug is found.
      Reported-by: Nikos Chantziaras <realnc@arcor.de>
      Tested-by: Nikos Chantziaras <realnc@arcor.de>
      Reported-by: Jens Axboe <jens.axboe@oracle.com>
      Tested-by: Jens Axboe <jens.axboe@oracle.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      LKML-Reference: <4AA93D34.8040500@arcor.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
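A quick arithmetic check shows why the wait value in the trace line above looks wrong. The sketch assumes that the event timestamp and the wait delta are measured on the same clock, which the commit message does not state; the function name is hypothetical.

```python
NSEC_PER_SEC = 1_000_000_000

# Values copied from the sched_stat_wait trace line.
EVENT_TS_NS = 2_802_059_479_616  # 2802.059479616 s
WAIT_NS = 2_801_938_766_276      # reported wait, in ns


def implied_wait_start_ns(event_ts_ns, wait_ns):
    """Point in time at which the wait would have had to begin."""
    return event_ts_ns - wait_ns


start_ns = implied_wait_start_ns(EVENT_TS_NS, WAIT_NS)
wait_seconds = WAIT_NS / NSEC_PER_SEC
```

Under that assumption the reported wait is roughly 2802 seconds, and the wait would have had to start about 0.12 seconds after the clock's zero point. That is consistent with, though not proof of, a delta computed against a stale or near-zero start timestamp.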
  5. 15 Jan, 2009 1 commit
    • sched: prefer wakers · e52fb7c0
      Peter Zijlstra authored
      Prefer tasks that wake other tasks to preempt quickly. This improves
      performance because more work is available sooner.
      
      The workload that prompted this patch was a kernel build over NFS4 (for some
      curious and not understood reason we had to revert commit 18de9735 to make
      any progress at all).
      
      Without this patch a make -j8 bzImage (of x86-64 defconfig) would take
      3m30-ish, with this patch we're down to 2m50-ish.
      
      psql-sysbench/mysql-sysbench show a slight improvement in peak performance as
      well, tbench and vmark seemed to not care.
      
      It is possible to improve upon the build time (to 2m20-ish) but that seriously
      destroys other benchmarks (just shows that there's more room for tinkering).
      
      Much thanks to Mike who put in a lot of effort to benchmark things and proved
      a worthy opponent with a competing patch.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 14 Jan, 2009 1 commit
    • mutex: implement adaptive spinning · 0d66bf6d
      Peter Zijlstra authored
      Change mutex contention behaviour such that it will sometimes busy wait on
      acquisition - moving its behaviour closer to that of spinlocks.
      
      This concept got ported to mainline from the -rt tree, where it was originally
      implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins.
      
      Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50)
      gave a 345% boost for VFS scalability on my testbox:
      
       # ./test-mutex-shm V 16 10 | grep "^avg ops"
       avg ops/sec:               296604
      
       # ./test-mutex-shm V 16 10 | grep "^avg ops"
       avg ops/sec:               85870
      
      The key criterion for the busy wait is that the lock owner has to be running on
      a (different) cpu. The idea is that as long as the owner is running, there is a
      fair chance it'll release the lock soon, and thus we'll be better off spinning
      instead of blocking/scheduling.
      
      Since regular mutexes (as opposed to rtmutexes) do not atomically track the
      owner, we add the owner in a non-atomic fashion and deal with the races in
      the slowpath.
      
      Furthermore, to ease the testing of the performance impact of this new code,
      there is a means to disable this behaviour at runtime (without having to reboot
      the system), when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y),
      by issuing the following command:
      
       # echo NO_OWNER_SPIN > /debug/sched_features
      
      This command re-enables spinning (this is also the default):
      
       # echo OWNER_SPIN > /debug/sched_features
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
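The spin-or-block criterion described in this commit message can be modelled as a simple predicate. This is only a sketch: the `Task` class, its fields, and the function name are hypothetical, and treating an unrecorded owner as "do not spin" is an assumption made to keep the example small, since the message only says that such races are handled in the slowpath.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    cpu: int       # CPU the task was last placed on
    on_cpu: bool   # True while the task is actually executing


def worth_spinning(owner: Optional[Task], this_cpu: int) -> bool:
    """Decide whether a contending task should busy wait for the mutex."""
    if owner is None:
        # Owner tracking is non-atomic, so the owner may briefly be unknown.
        # Assumption for this sketch: fall back to blocking in that case.
        return False
    # Spin only while the owner is running on a different CPU; it then has
    # a fair chance of releasing the lock soon.
    return owner.on_cpu and owner.cpu != this_cpu
```

A contender on CPU 0 would spin on a lock whose owner is running on CPU 1, and would block if that owner has been scheduled out or sits on CPU 0 itself.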
  7. 05 Nov, 2008 1 commit
    • sched: backward looking buddy · 4793241b
      Peter Zijlstra authored
      
      Impact: improve/change/fix wakeup-buddy scheduling
      
      Currently we only have a forward looking buddy, that is, we prefer to
      schedule to the task we last woke up, under the presumption that it's
      going to consume the data we just produced, and therefore will have
      cache hot benefits.
      
      This allows co-waking producer/consumer task pairs to run ahead of the
      pack for a little while, keeping their cache warm. Without this, we
      would interleave all pairs, utterly trashing the cache.
      
      This patch introduces a backward looking buddy. Suppose that, in the
      above scenario, the consumer preempts the producer before it can go
      to sleep. We will then miss the wakeup from consumer to producer
      (it's already running, after all), breaking the cycle and reverting
      to the cache-trashing interleaved schedule pattern.
      
      The backward buddy will try to schedule back to the task that woke us
      up in case the forward buddy is not available, under the assumption
      that the last task will be the most cache-hot task around, barring
      current.
      
      This will basically allow a task to continue after it got preempted.
      
      In order to avoid starvation, we allow either buddy to get wakeup_gran
      ahead of the pack.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
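The selection order described in this commit message (forward buddy first, then backward buddy, each allowed to run at most wakeup_gran ahead of the pack) can be sketched as below. The `Entity` class, the list-based run queue, and the function signature are hypothetical simplifications; the kernel keeps entities in a tree ordered by virtual runtime.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    name: str
    vruntime: int


def pick_next(queue: List[Entity], next_buddy: Optional[Entity],
              last_buddy: Optional[Entity], wakeup_gran: int) -> Entity:
    """Pick the entity to run, preferring buddies within wakeup_gran."""
    # The fairest choice is the entity that has run the least.
    leftmost = min(queue, key=lambda e: e.vruntime)
    # Forward buddy (task we last woke) first, then backward buddy
    # (task that was preempted), as long as neither is too far ahead.
    for buddy in (next_buddy, last_buddy):
        if (buddy is not None and buddy in queue
                and buddy.vruntime - leftmost.vruntime <= wakeup_gran):
            return buddy
    return leftmost
```

The wakeup_gran bound is what prevents starvation in this model: a buddy that has already run too far ahead of the leftmost entity loses its preference.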
  8. 20 Oct, 2008 1 commit
  9. 22 Sep, 2008 2 commits
    • sched: turn off WAKEUP_OVERLAP · f681bbd6
      Ingo Molnar authored
      
      WAKEUP_OVERLAP is not a winner on a 16way box, running psql+sysbench:
      
             .27-rc7-NO_WAKEUP_OVERLAP  .27-rc7-WAKEUP_OVERLAP
      -------------------------------------------------
          1:             694              811    +14.39%
          2:            1454             1427    -1.86%
          4:            3017             3070    +1.70%
          8:            5694             5808    +1.96%
         16:           10592            10612    +0.19%
         32:            9693             9647    -0.48%
         64:            8507             8262    -2.97%
        128:            8402             7087    -18.55%
        256:            8419             5124    -64.30%
        512:            7990             3671    -117.62%
      -------------------------------------------------
        SUM:           64466            55524    -16.11%
      
      ... so turn it off by default.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
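The percentage column in the table above includes a drop larger than 100%, which is only possible if the change is expressed relative to the WAKEUP_OVERLAP figure rather than the baseline. The formula below is inferred from the numbers, not stated in the commit message, and it reproduces the listed column only to within rounding, presumably because the printed throughput values are themselves rounded.

```python
# clients: (NO_WAKEUP_OVERLAP, WAKEUP_OVERLAP, listed change in percent)
ROWS = {
    1: (694, 811, 14.39),
    2: (1454, 1427, -1.86),
    4: (3017, 3070, 1.70),
    8: (5694, 5808, 1.96),
    16: (10592, 10612, 0.19),
    32: (9693, 9647, -0.48),
    64: (8507, 8262, -2.97),
    128: (8402, 7087, -18.55),
    256: (8419, 5124, -64.30),
    512: (7990, 3671, -117.62),
}


def delta_pct(without_overlap, with_overlap):
    """Change relative to the WAKEUP_OVERLAP figure (assumed formula)."""
    return (with_overlap - without_overlap) / with_overlap * 100.0


def max_deviation():
    """Largest gap between the assumed formula and the listed column."""
    return max(abs(delta_pct(a, b) - listed) for a, b, listed in ROWS.values())
```

Measured against the baseline instead, the 512-client row would be a throughput loss of a little over half, which is the more conventional way to read it.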
    • sched: wakeup preempt when small overlap · 15afe09b
      Peter Zijlstra authored
      
      Lin Ming reported a 10% OLTP regression against 2.6.27-rc4.
      
      The difference seems to come from different preemption aggressiveness,
      which affects the cache footprint of the workload and its effective
      cache trashing.
      
      Aggressively preempt a task if its avg overlap is very small. This should
      avoid the task going to sleep, so that we find it still running when we
      schedule back to it, saving a wakeup.
      Reported-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
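The small-overlap rule from this commit message might look roughly like this as a predicate. It is a sketch under assumptions: the function name, the `threshold_ns` parameter, and the choice to require a small overlap for both the current and the woken task are hypothetical; the message itself only says that a task with a very small average overlap is preempted aggressively.

```python
def should_preempt_on_wakeup(curr_avg_overlap_ns, woken_avg_overlap_ns,
                             threshold_ns):
    """Preempt at wakeup when both tasks overlap only briefly.

    "Overlap" here means how long a waker keeps running after it has
    woken another task. A very small value suggests the waker is about
    to sleep anyway, so switching right away costs little.
    """
    return (curr_avg_overlap_ns < threshold_ns
            and woken_avg_overlap_ns < threshold_ns)
```

With a threshold of 500000 ns, two tasks that each overlap for around 20000 ns would trigger preemption, while a waker that typically keeps running for several milliseconds would not.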
  10. 21 Aug, 2008 1 commit
  11. 27 Jun, 2008 5 commits
  12. 10 Jun, 2008 1 commit
  13. 19 Apr, 2008 1 commit