[illumos-Developer] initial review: bombproof taskq dispatch
George Wilson
gwilson at zfsmail.com
Thu Jul 7 07:21:00 PDT 2011
Garrett,
A few comments:
- With your change do we still need to use TASKQ_PREPOPULATE in
spa_taskq_create()?
- I would really love for the #ifdef _KERNEL lines to go away before
integrating this, any chance that can happen? Let me know if you need
help with that.
- In taskq_dispatch(), lines 1136 - 1162 is it possible to make this a
common routine so that both taskq_dispatch() and taskq_dispatch_ent()
can both use?
- In taskq_thread(), line# 1582, this comment is confusing since we
don't really talk about prealloc'd tasks and may get confused with
prealloced taskqs which are mentioned in this file. I have to admit that
this interface is a little funky and I'm wondering if we shouldn't try
to better define this. Maybe something like taskq_ent_create() and
taskq_ent_destroy() that can be called by external consumers?
- In taskq_thread(), lines 1589 - 1592, you manipulate tqe pointers
without the tq_lock held, is that intentional?
- does mdb's ::taskq still show the correct information with this change?
Thanks,
George
On 6/30/11 10:10 AM, Garrett D'Amore wrote:
> So, I'm getting ready to submit an RTI on this... but I'd like to give
> folks another, more current, chance to read the webrev.
>
> We believe this change resolves a key performance problem with highly
> loaded systems by reducing overall memory pressure and latency of some
> operations.
>
> There is a similar change to update a taskq_dispatch in the COMSTAR
> stack which I'll be submitting soon, which depends on this code.
>
> Please let me know if you have any feedback on this. Thanks.
>
> - Garrett
>
>
> On Wed, 2011-05-04 at 15:59 -0700, Garrett D'Amore wrote:
>> So one of the problems I have seen with taskq_dispatch is that it is
>> either prone to failure (ENOMEM), or you have to be willing to sleep.
>>
>> In fact, ZFS gets this wrong, as it uses TQ_SLEEP in a case that is
>> potentially called by an interrupt, so there's a latent bug there. (And
>> I suspect its not the only place.) Dealing with dispatch failures can
>> be a major PITA for anyone trying to dispatch from interrupt context.
>> This makes life miserable for certain of us who write device drivers
>> that have to do things like this. :-)
>>
>> But you could imagine (I did! But this idea is used in *BSD as well) a
>> version that used preallocated taskq structures to have a very light
>> weight and totally bomb-proof implementation. Here is an initial draft
>> with zio's taskqs converted to use it:
>>
>> http://mexico.purplecow.org/gdamore/webrev/taskq/
>>
>> I only converted zio although there are other opportunities. The reason
>> for this is that
>>
>> a) zio's use of taskq_dispatch showed up as very hot on some significant
>> stressful runs in storage -- so this will immediately yield a lighter
>> load on CPUs
>>
>> b) zio is complex enough, that if it works for ZIO, then I can probably
>> make it work for a bunch of other cases.
>>
>> The drawback here is that zio's code is imported into userland, and I
>> didn't want to expose these internals into the userland code for now, so
>> I have introduced some ugly ifdefs. I really don't like how this code
>> is shared with userland, but that's a battle for a different day.
>>
>> The other concern is that my design depends on zio's only being
>> outstanding on a single taskq at a time. I *think* other aspects of
>> zio_execute rely on this assumption being true, but if I'm mistaken, I'd
>> really like to know it. (I added some assertions to check for this
>> specifically, and I'll run a debug version in a full stress run to
>> double confirm.)
>>
>> Initial testing so far suggests it *works*, but I've not really hit this
>> hard yet, and I certainly have not done any performance analysis yet.
>> That will be forthcoming.
>>
>> What I want to do is give this code out for early review to give
>> interested parties a chance to give feedback as early into the process
>> as possible.
>>
>> Once this is all done, and integrated, I will probably go on a hunt to
>> find other likely candidates -- I *know* there are a bunch of places
>> that code could be simplified with this interface, and failure cases
>> (and associated handling) simply removed. Probably there will be some
>> very modest (maybe not measurable) performance benefits for those cases
>> as well. :-)
>>
>> Thanks.
>>
>> - Garrett
>>
>>
>> _______________________________________________
>> Developer mailing list
>> Developer at lists.illumos.org
>> http://lists.illumos.org/m/listinfo/developer
>
>
> _______________________________________________
> Developer mailing list
> Developer at lists.illumos.org
> http://lists.illumos.org/m/listinfo/developer
More information about the Developer
mailing list