DragonFly kernel List (threaded) for 2004-01
Re: chooseproc cpu affinity question
:> Yes, the check for only the next one is intended. Putting an iterative
:> loop in the middle of the process selection code's critical path is
:> not very conducive to performance.
:Well, but with n CPUs your chance is 1/n you hit one.
:And the expected length k of the loop is << runqcount.
:How expensive is an IPI plus cache invalidation plus cache misses
:compared to k times this loop ?
:I don't know and I don't want to step on anyones toes.
:I just think it's worth a thought.
:Maybe trying min(n, runqcount) times or so would do the job...
Well, there are a lot of factors here, such as the timing of the
scheduling event and which cpu's scheduler checks the run queue
first. Affinity usually runs into the most trouble with
programs which block for short periods of time, such as when doing
read I/O, which a scheduling runq check (of any type) would not do
a very good job of detecting. The single-node lookahead chance does
far better than 1/n because while you have more cpus to play with,
you also have fewer processes on the runq (and more already running on
a particular cpu), and you have timing effects that work in your favor.
For example, take a look at setrunqueue() in kern_switch.c. setrunqueue()
is responsible not only for entering the process onto the run queue, but
also for waking up the 'best' cpu for the scheduling of that process.
This is where the meat of DFly's affinity actually happens... it wakes
up the cpu that ran the process before in order to give that cpu a
chance to schedule the newly runnable process before other cpus. It
does not, however, prevent other cpus from scheduling the newly
runnable process. You wind up with a weighted statistical effect that
should, in most cases, schedule the process on a cpu with affinity to
that process but which does not prevent the process from being scheduled
by some other cpu if the target cpu is too busy with other processes.
In short, there are a lot of factors involved and some real life
statistical testing would need to be done to find the right algorithms
to schedule the processes optimally. It isn't a good idea to spend a lot
of cpu time on just one tiny aspect of the problem and, in fact, doing
so can introduce some pretty awful degenerate situations.
For example, in a very heavily loaded system there might be hundreds of
processes on the run queues. The last thing you want to do is cause an
iterative loop to be executed on every wakeup() event. That
could easily lead to billions of wasted cpu cycles.