[illumos-Developer] Dedup

Marcelo Leal sunos.x86 at gmail.com
Tue Jul 19 07:32:55 PDT 2011


 Thanks! Just one more question... ;-)
 It's easy to think about this code in terms of empty storage and a
fresh start. But what about a reboot, and the re-creation of the
dedup table afterwards?
 My guess is that the dedup table is recreated from scratch, so we
could end up with a less effective dedup ratio (which I think is not a
problem at all). But I'm curious to know whether the ZFS hackers came
up with a solution for this.
 Thanks again!

 Leal
[ http://www.eall.com.br/blog ]
-------=- pOSix rules -=-------



2011/7/18 George Wilson <gwilson at zfsmail.com>:
> There is no in-memory dedup (at least today). The dedup table which we store
> in the ARC is just metadata required for writing and freeing blocks. Those
> are the two code paths where dedup plays a big role.
>
> Thanks,
> George
>
> On 7/18/11 11:22 AM, Marcelo Leal wrote:
>>
>> Hello Garrett,
>>  My question was not "exactly" this... actually, I do know about
>> dedup's memory requirements.
>>  My question is whether the data cached in memory is not deduped.
>>  Look at this comment on the post that I mentioned:
>>
>> -------------
>> "Do I read "However if a subsequent read is for a duplicate block
>> which happens to be in the pool ARC cache ... only a much faster copy
>> of duplicate block will be necessary" correctly as "Data is not held
>> deduped in the ARC cache"?
>>
>> I.e. using the S7000 to hold OS image in a VDI farm might dedup the OS
>> images very effectively but would not be very effective in serving the
>> OS images? The cache would have to be scaled linearly with the number
>> of VMs running"?
>> -------------
>>
>> And the answer from Roch was:
>>
>> ----------
>> "Moritz, That's correct".
>> ----------
>>
>>
>>  Leal
>> [ http://www.eall.com.br/blog ]
>> -------=- pOSix rules -=-------
>>
>>
>>
>> 2011/7/18 Garrett D'Amore<garrett at nexenta.com>:
>>>
>>> If you mean, does dedup still need a bunch of system memory *or* an SSD
>>> backed L2ARC (and a still large, although possibly smaller system memory
>>> size), then the answer is *yes*.  The DDT still wants to live in the ARC,
>>> and you really don't want to have to go to spinning rust to access it.
>>>
>>> There are many considerations when enabling dedup, and in general, you
>>> should be very careful before you enable it.  If you are going to only
>>> reduce your storage requirements by half or less, then I would not use it at
>>> all.  And I would never use it unless I had a boatload of memory to begin
>>> with.  (One possible exception would be if my entire data set was located on
>>> SSDs.  In that case, dedup even with a lower dedup ratio might still make
>>> sense, given the precious nature of bytes located on SSD.)
>>>
>>>  -- Garrett D'Amore
>>>
>>> On Jul 18, 2011, at 8:12 AM, "Marcelo Leal"<sunos.x86 at gmail.com>  wrote:
>>>
>>>> Hello there,
>>>> I was reading this (old) post from Roch:
>>>>
>>>> http://blogs.oracle.com/roch/entry/dedup_performance_considerations1
>>>>
>>>> Looking at the comment there, is that still true for the current
>>>> implementation of dedup in ZFS?
>>>> I mean, I think the performance gain from deduped blocks in memory
>>>> could matter more than the space savings on disk. ;-)
>>>> So I'm asking whether that drawback persists in today's code.
>>>>
>>>> Thanks!
>>>>
>>>> Leal
>>>> [ http://www.eall.com.br/blog ]
>>>> -------=- pOSix rules -=-------
>>>>
>>>> _______________________________________________
>>>> Developer mailing list
>>>> Developer at lists.illumos.org
>>>> http://lists.illumos.org/m/listinfo/developer
>>
>
>
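
To put rough numbers on the points quoted above (dedup saves disk space,
the ARC holds data non-deduped so the cache scales linearly with the number
of VMs, and Garrett's "half or less" rule of thumb), here is a minimal
back-of-the-envelope sketch in Python. Everything in it is an assumption for
illustration: the function name estimate_vdi, the idea of modelling each VM
image as a shared part plus a per-VM unique part, the 128 KiB block size, and
above all the 320 bytes per in-core DDT entry, which is only a commonly
quoted rule of thumb and does not come from this thread. Treat the output as
an order-of-magnitude guess, not as how ZFS accounts for anything.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class Estimate:
    logical_gib: float           # data as seen by the VMs
    on_disk_gib: float           # data actually stored after dedup
    dedup_ratio: float           # logical / on-disk
    arc_to_cache_all_gib: float  # ARC needed to cache every image
    ddt_core_gib: float          # rough in-core DDT size (assumed entry size)
    passes_2x_rule: bool         # better than "half or less" savings?


def estimate_vdi(num_vms, image_gib, unique_fraction,
                 block_bytes=128 * 1024, ddt_entry_bytes=320):
    """Hypothetical model: every image shares (1 - unique_fraction) of its
    blocks with all other images and owns the rest exclusively."""
    if num_vms < 1 or image_gib <= 0 or not 0.0 <= unique_fraction <= 1.0:
        raise ValueError("invalid input")

    shared_gib = image_gib * (1.0 - unique_fraction)
    unique_gib = image_gib * unique_fraction

    logical_gib = num_vms * image_gib
    # The shared part is stored once, the unique part once per VM.
    on_disk_gib = shared_gib + num_vms * unique_gib
    ratio = logical_gib / on_disk_gib

    # Per the thread, the ARC is not deduped, so caching every image
    # costs the full logical size: linear in the number of VMs.
    arc_gib = logical_gib

    # One DDT entry per unique block; ddt_entry_bytes is an assumption.
    unique_blocks = on_disk_gib * GIB / block_bytes
    ddt_gib = unique_blocks * ddt_entry_bytes / GIB

    return Estimate(logical_gib, on_disk_gib, ratio, arc_gib, ddt_gib,
                    passes_2x_rule=ratio > 2.0)


if __name__ == "__main__":
    print(estimate_vdi(num_vms=100, image_gib=10, unique_fraction=0.05))
```

With these made-up inputs the on-disk footprint shrinks a lot while the ARC
needed to serve all images stays at the full logical size, which is the
drawback the blog comment describes.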


