[illumos-Developer] Dedup

George Wilson gwilson at zfsmail.com
Tue Jul 19 13:10:53 PDT 2011


We don't need to recreate the dedup table from scratch, since the dedup
table is persistent on disk. Instead, ZFS needs to start prefetching the
dedup table into cache. This is something that should be done, and it would
be a pretty big performance win.
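
To make the difference concrete, here is a toy sketch (not ZFS code) that
counts how many on-demand disk reads a cold dedup table costs after a
reboot, with and without an up-front prefetch of the whole table. The class
name, the method names and the dict-based "disk" are all made up for
illustration; the real DDT lives in the ARC and is far more involved.

```python
class ToyDedupTable:
    """Hypothetical model: a persistent on-disk table plus an in-core cache."""

    def __init__(self, on_disk):
        self.on_disk = dict(on_disk)  # checksum -> refcount; survives a reboot
        self.in_core = {}             # cached entries; lost on reboot
        self.complete = False         # True once the whole table is cached
        self.demand_faults = 0        # on-demand disk reads since last reboot

    def reboot(self):
        # The table itself is persistent; only the cached copy is lost.
        self.in_core = {}
        self.complete = False
        self.demand_faults = 0

    def prefetch_all(self):
        # Assumption: one bulk pass over the table, not counted as a
        # per-write demand fault.
        self.in_core = dict(self.on_disk)
        self.complete = True

    def lookup(self, checksum):
        # Called on the write and free paths in this toy model.
        if checksum in self.in_core:
            return self.in_core[checksum]
        if self.complete:
            # Whole table is cached, so a miss means "not a duplicate".
            return None
        self.demand_faults += 1       # must go to disk to find out
        entry = self.on_disk.get(checksum)
        if entry is not None:
            self.in_core[checksum] = entry
        return entry


def faults_after_reboot(on_disk, checksums, prefetch):
    """Replay a sequence of block checksums against a freshly rebooted table."""
    ddt = ToyDedupTable(on_disk)
    ddt.reboot()
    if prefetch:
        ddt.prefetch_all()
    for checksum in checksums:
        ddt.lookup(checksum)
    return ddt.demand_faults
```

Without the prefetch, every first touch of an entry (and every lookup of a
block that is not in the table yet) pays a disk read in the write path; with
the prefetch, that cost moves out of the write path entirely.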

- George

On 7/19/11 11:52 AM, Marcelo Leal wrote:
>   For sure, that answers it, thanks!
>   But don't you agree that, to "solve" the performance impact of a cold
> start, one option would be to recreate the dedup table from scratch?
>   I mean, as an 80/20 solution, I think that would be fine.
>   Doing a disk read for each write is not *good*, and since dedup
> is *variable* (it can be enabled/disabled at run time), there
> is no problem with not being 100% deduped. I would prefer that
> to a longer MTTR when we have downtime, mostly because that
> is unpredictable. But I don't know if I'm the "rule" in this
> situation, or on the exception list. ;-)
>
>   Leal
> [ http://www.eall.com.br/blog ]
> -------=- pOSix rules -=-------
>
>
>
> 2011/7/19 George Wilson<gwilson at zfsmail.com>:
>> On 7/19/11 7:32 AM, Marcelo Leal wrote:
>>>   Thanks! Just one more question... ;-)
>>>   It's easy to think about this code in terms of empty storage and a
>>> fresh start. But what about a reboot, and the actual re-creation of
>>> this dedup table?
>> The dedup table is persistent so you don't need to recreate it but on a
>> reboot the dedup table will no longer reside in-core so you'll need to start
>> faulting it in. Unfortunately this happens as new writes or frees take place
>> which means that performance will suffer until the dedup table is cached. We
>> had talked about doing some prefetching of the whole table (we prefetch
>> entries as needed to reduce the performance hit) but never implemented that.
>>
>> Does that answer your question?
>>
>> Thanks,
>> George
>>>   What I can imagine is that the dedup table will be recreated from
>>> scratch, so we would get a less effective dedup ratio (which I think
>>> is not a problem at all). But I'm curious to know whether the ZFS
>>> hackers have created a solution for this.
>>>   Thanks again!
>>>
>>>   Leal
>>> [ http://www.eall.com.br/blog ]
>>> -------=- pOSix rules -=-------
>>>
>>>
>>>
>>> 2011/7/18 George Wilson<gwilson at zfsmail.com>:
>>>> There is no in-memory dedup (at least today). The dedup table which we
>>>> store in the ARC is just metadata required for writing and freeing
>>>> blocks. Those are the two code paths where dedup plays a big role.
>>>>
>>>> Thanks,
>>>> George
>>>>
>>>> On 7/18/11 11:22 AM, Marcelo Leal wrote:
>>>>> Hello Garrett,
>>>>>   My question was not "exactly" this... actually, I do know about the
>>>>> memory requirements of dedup.
>>>>>   My question is whether blocks cached in memory are not "deduped".
>>>>>   Look at this comment on the post that i mentioned:
>>>>>
>>>>> -------------
>>>>> "Do I read "However if a subsequent read is for a duplicate block
>>>>> which happens to be in the pool ARC cache ... only a much faster copy
>>>>> of duplicate block will be necessary" correctly as "Data is not held
>>>>> deduped in the ARC cache"?
>>>>>
>>>>> I.e. using the S7000 to hold OS image in a VDI farm might dedup the OS
>>>>> images very effectively but would not be very effective in serving the
>>>>> OS images? The cache would have to be scaled linearly with the number
>>>>> of VMs running"?
>>>>> -------------
>>>>>
>>>>> And the answer from Roch was:
>>>>>
>>>>> ----------
>>>>> "Moritz, That's correct".
>>>>> ----------
>>>>>
>>>>>
>>>>>   Leal
>>>>> [ http://www.eall.com.br/blog ]
>>>>> -------=- pOSix rules -=-------
>>>>>
>>>>>
>>>>>
>>>>> 2011/7/18 Garrett D'Amore<garrett at nexenta.com>:
>>>>>> If you mean, does dedup still need a bunch of system memory *or* an
>>>>>> SSD-backed L2ARC (and a still large, although possibly smaller, system
>>>>>> memory size), then the answer is *yes*.  The DDT still wants to live in
>>>>>> the ARC, and you really don't want to have to go to spinning rust to
>>>>>> access it.
>>>>>>
>>>>>> There are many considerations when enabling dedup, and in general, you
>>>>>> should be very careful before you enable it.  If you are going to only
>>>>>> reduce your storage requirements by half or less, then I would not use
>>>>>> it at all.  And I would never use it unless I had a boatload of memory
>>>>>> to begin with.  (One possible exception would be if my entire data set
>>>>>> was located on SSDs.  In that case, dedup even with a lower dedup ratio
>>>>>> might still make sense, given the precious nature of bytes located on
>>>>>> SSD.)
>>>>>>
>>>>>>   -- Garrett D'Amore
>>>>>>
>>>>>> On Jul 18, 2011, at 8:12 AM, "Marcelo Leal"<sunos.x86 at gmail.com> wrote:
>>>>>>
>>>>>>> Hello there,
>>>>>>> I was reading this (old) post from Roch:
>>>>>>>
>>>>>>> http://blogs.oracle.com/roch/entry/dedup_performance_considerations1
>>>>>>>
>>>>>>> Looking at the comment, is that still true for the current
>>>>>>> implementation of dedup in ZFS?
>>>>>>> I mean, I think the performance gain from deduped blocks in memory
>>>>>>> could be better than the space savings on disk. ;-)
>>>>>>> So I'm asking whether that drawback persists in the current code.
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> Leal
>>>>>>> [ http://www.eall.com.br/blog ]
>>>>>>> -------=- pOSix rules -=-------
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Developer mailing list
>>>>>>> Developer at lists.illumos.org
>>>>>>> http://lists.illumos.org/m/listinfo/developer



