[illumos-Developer] initial review: bombproof taskq dispatch

Wed May 4 15:59:17 PDT 2011

So one of the problems I have seen with taskq_dispatch is that it is
either prone to failure (ENOMEM), or you have to be willing to sleep.   

In fact, ZFS gets this wrong, as it uses TQ_SLEEP in a case that is
potentially called by an interrupt, so there's a latent bug there.  (And
I suspect its not the only place.)  Dealing with dispatch failures can
be a major PITA for anyone trying to dispatch from interrupt context.
This makes life miserable for certain of us who write device drivers
that have to do things like this. :-)

But you could imagine (I did!  But this idea is used in *BSD as well) a
version that used preallocated taskq structures to have a very light
weight and totally bomb-proof implementation.  Here is an initial draft
with zio's taskqs converted to use it:

http://mexico.purplecow.org/gdamore/webrev/taskq/

I only converted zio although there are other opportunities.  The reason
for this is that

a) zio's use of taskq_dispatch showed up as very hot on some significant
stressful runs in storage -- so this will immediately yield a lighter
load on CPUs

b) zio is complex enough, that if it works for ZIO, then I can probably
make it work for a bunch of other cases.

The drawback here is that zio's code is imported into userland, and I
didn't want to expose these internals into the userland code for now, so
I have introduced some ugly ifdefs.  I really don't like how this code
is shared with userland, but that's a battle for a different day.

The other concern is that my design depends on zio's only being
outstanding on a single taskq at a time.  I *think* other aspects of
zio_execute rely on this assumption being true, but if I'm mistaken, I'd
really like to know it.  (I added some assertions to check for this
specifically, and I'll run a debug version in a full stress run to
double confirm.)

Initial testing so far suggests it *works*, but I've not really hit this
hard yet, and I certainly have not done any performance analysis yet.
That will be forthcoming.

What I want to do is give this code out for early review to give
interested parties a chance to give feedback as early into the process
as possible.

Once this is all done, and integrated, I will probably go on a hunt to
find other likely candidates -- I *know* there are a bunch of places
that code could be simplified with this interface, and failure cases
(and associated handling) simply removed. Probably there will be some
very modest (maybe not measurable) performance benefits for those cases
as well. :-)

Thanks.

	- Garrett