[illumos-Developer] MUCH better --> zil for zvol TRUNCATE (and iSCSI UNMAP) working...
George Wilson
gwilson at zfsmail.com
Wed Mar 2 14:40:21 PST 2011
On 3/2/11 12:36 PM, Dan McDonald wrote:
> On Wed, Mar 02, 2011 at 10:34:25AM -0800, George Wilson wrote:
>>> I dunno... zvol_replay_write() has the dmu_tx_* stuff. If I rip it out of
>>> replay_truncate, should I also rip it out of replay_write() as well?
>> No, in the case of zvol_replay_write() you're updating the contents
>> of the zvol in one transaction so it's necessary to create the
>> transaction. In the case of zvol_replay_truncate() the
>> dmu_free_long_range() is going to create the transactions for you.
>> That's why I don't believe you need the dmu_tx_* stuff for that
>> particular op.
> In the replay, this makes sense. (I just now noticed that dmu_write() takes
> tx as an argument.) Naturally the DKIOCFREE code still needs all of the tx
> bits, though, so it can log to the ZIL.
>
> And combining two threads...
>
>
> On Wed, Mar 02, 2011 at 10:29:23AM -0800, George Wilson wrote:
>>>> The one thing that the txg_wait_synced() provides is for callers to
>>>> wait for all the blocks to be freed before continuing. I don't know
>>>> if this is a requirement or not.
>>> Your description of txg_wait_synced() behavior does match up with the
>>> WRITEBACK_CACHE_DISABLED, but Eric points out:
>> I see 3 slightly related controls for DKIOCFREE:
>>
>> 1). is write-cache enabled (ZVOL_WCE)?
>> 2). is sync-always enabled (ZVOL_SYNC_ALWAYS)?
> These two flags/values (it's ZFS_SYNC_ALWAYS...) appear to be already
> potentially present in zvol state. I therefore do not need to *add* these,
> so much as check for them.
>
>> 3). do you want to wait for the frees to complete (DF_WAIT_SYNC)?
> This comes via the ioctl.
>
>> I see 1 and 2 above as ways to control whether or not a zil_commit()
>> is required as part of the free. I see #3 as way to control do you
>> want the operation to not return until the free is complete and all
>> metadata has been updated.
>>
>> My initial comments (long time ago) mentioned that this operation
>> should always be log to the zil. If we want to honor
>> SL_WRITEBACK_CACHE_DISABLED then we should make the zil_commit()
>> optional based on #1 and 2.
> Okayyyy..
>
>>> I wonder if the principal of least surprise might apply here? That would
>>> argue for implementing DF_WAIT_SYNC by using txg_wait_synced().
>> I think there is a need for a "wait" flag and maybe there's a need to
>> have all three flags for completeness.
> Right now I have:
>
> error = dmu_tx_assign(tx, TXG_WAIT);
> if (error != 0) {
> dmu_tx_abort(tx);
> } else {
> zvol_log_truncate(zv, tx, df.df_start,
> df.df_length, B_TRUE);
> error = dmu_free_long_range(zv->zv_objset, ZVOL_OBJ,
> df.df_start, df.df_length);
> dmu_tx_commit(tx);
> }
>
> zfs_range_unlock(rl);
>
> ---> if (error == 0)
> ---> zil_commit(zv->zv_zilog, ZVOL_OBJ);
>
> Where I have arrows should probably change then? Perhaps something like:
>
> if (error == 0&& (zv->zv_flags& ZVOL_WCE)&&
> zv->zv_objset->os_sync == ZFS_SYNC_ALWAYS) {
> zil_commit(zv->zv_zilog, ZVOL_OBJ);
> if (df.df_flags& DF_WAIT_SYNC) {
> txg_wait_synced(dmu_objset_pool(zv->zv_objset,
> 0);
> }
> }
I think that it should be:
if (error == 0) {
/*
* If the write-cache is disabled or 'sync' property is set to
'always' then
* treat this as a synchronous operation (i.e. commit to zil).
*/
if (!(zv->zv_flags & ZVOL_WCE) || (zv->zv_objset->os_sync ==
ZFS_SYNC_ALWAYS))
zil_commit(...)
if (df.df_flags & DF_WAIT_SYNC)
txg_wait_synced(...)
}
- George
> Does this make sense to you? I hope this is the last bits of tweaking I'll
> need. I think I'm close to freezing, then I will run my tests again, and
> then final-RTI/push!
>
> Thanks,
> Dan
>
> _______________________________________________
> Developer mailing list
> Developer at lists.illumos.org
> http://lists.illumos.org/m/listinfo/developer
More information about the Developer
mailing list