[illumos-Developer] initial review: bombproof taskq dispatch

George Wilson gwilson at zfsmail.com
Thu Jul 7 07:21:00 PDT 2011


Garrett,

A few comments:

- With your change do we still need to use TASKQ_PREPOPULATE in 
spa_taskq_create()?
- I would really love for the #ifdef _KERNEL lines to go away before 
integrating this, any chance that can happen? Let me know if you need 
help with that.
- In taskq_dispatch(), lines 1136 - 1162 is it possible to make this a 
common routine so that both taskq_dispatch() and taskq_dispatch_ent() 
can both use?
- In taskq_thread(), line# 1582, this comment is confusing since we 
don't really talk about prealloc'd tasks and may get confused with 
prealloced taskqs which are mentioned in this file. I have to admit that 
this interface is a little funky and I'm wondering if we shouldn't try 
to better define this. Maybe something like taskq_ent_create() and 
taskq_ent_destroy() that can be called by external consumers?
- In taskq_thread(), lines 1589 - 1592, you manipulate tqe pointers 
without the tq_lock held, is that intentional?
- does mdb's ::taskq still show the correct information with this change?

Thanks,
George


On 6/30/11 10:10 AM, Garrett D'Amore wrote:
> So, I'm getting ready to submit an RTI on this... but I'd like to give
> folks another, more current, chance to read the webrev.
>
> We believe this change resolves a key performance problem with highly
> loaded systems by reducing overall memory pressure and latency of some
> operations.
>
> There is a similar change to update a taskq_dispatch in the COMSTAR
> stack which I'll be submitting soon, which depends on this code.
>
> Please let me know if you have any feedback on this.  Thanks.
>
> 	- Garrett
>
>
> On Wed, 2011-05-04 at 15:59 -0700, Garrett D'Amore wrote:
>> So one of the problems I have seen with taskq_dispatch is that it is
>> either prone to failure (ENOMEM), or you have to be willing to sleep.
>>
>> In fact, ZFS gets this wrong, as it uses TQ_SLEEP in a case that is
>> potentially called by an interrupt, so there's a latent bug there.  (And
>> I suspect its not the only place.)  Dealing with dispatch failures can
>> be a major PITA for anyone trying to dispatch from interrupt context.
>> This makes life miserable for certain of us who write device drivers
>> that have to do things like this. :-)
>>
>> But you could imagine (I did!  But this idea is used in *BSD as well) a
>> version that used preallocated taskq structures to have a very light
>> weight and totally bomb-proof implementation.  Here is an initial draft
>> with zio's taskqs converted to use it:
>>
>> http://mexico.purplecow.org/gdamore/webrev/taskq/
>>
>> I only converted zio although there are other opportunities.  The reason
>> for this is that
>>
>> a) zio's use of taskq_dispatch showed up as very hot on some significant
>> stressful runs in storage -- so this will immediately yield a lighter
>> load on CPUs
>>
>> b) zio is complex enough, that if it works for ZIO, then I can probably
>> make it work for a bunch of other cases.
>>
>> The drawback here is that zio's code is imported into userland, and I
>> didn't want to expose these internals into the userland code for now, so
>> I have introduced some ugly ifdefs.  I really don't like how this code
>> is shared with userland, but that's a battle for a different day.
>>
>> The other concern is that my design depends on zio's only being
>> outstanding on a single taskq at a time.  I *think* other aspects of
>> zio_execute rely on this assumption being true, but if I'm mistaken, I'd
>> really like to know it.  (I added some assertions to check for this
>> specifically, and I'll run a debug version in a full stress run to
>> double confirm.)
>>
>> Initial testing so far suggests it *works*, but I've not really hit this
>> hard yet, and I certainly have not done any performance analysis yet.
>> That will be forthcoming.
>>
>> What I want to do is give this code out for early review to give
>> interested parties a chance to give feedback as early into the process
>> as possible.
>>
>> Once this is all done, and integrated, I will probably go on a hunt to
>> find other likely candidates -- I *know* there are a bunch of places
>> that code could be simplified with this interface, and failure cases
>> (and associated handling) simply removed. Probably there will be some
>> very modest (maybe not measurable) performance benefits for those cases
>> as well. :-)
>>
>> Thanks.
>>
>> 	- Garrett
>>
>>
>> _______________________________________________
>> Developer mailing list
>> Developer at lists.illumos.org
>> http://lists.illumos.org/m/listinfo/developer
>
>
> _______________________________________________
> Developer mailing list
> Developer at lists.illumos.org
> http://lists.illumos.org/m/listinfo/developer




More information about the Developer mailing list