[illumos-Developer] Dedup

George Wilson gwilson at zfsmail.com
Mon Jul 18 18:34:01 PDT 2011


There is no in-memory dedup (at least today). The dedup table which we 
store in the ARC is just metadata required for writing and freeing 
blocks. Those are the two code paths where dedup plays a big role.
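
To illustrate the point about the write and free paths, here is a toy
model in Python. It is only a sketch and not ZFS code: the class
ToyDedupTable, its methods, and the (address, checksum) tuple standing
in for a block pointer are all hypothetical names made up for this
example. It shows a table keyed by checksum that is consulted when a
block is written and when it is freed, and that keeps a reference count
so the physical block is released only on the last free.

```python
import hashlib


class ToyDedupTable:
    """Toy stand-in for a dedup table: checksum -> [address, refcount].

    Hypothetical illustration only; this is not how ZFS lays out its DDT.
    """

    def __init__(self):
        self.entries = {}    # checksum -> [address, refcount]
        self.next_addr = 0   # next unused "physical" address
        self.lookups = 0     # how many times the table was consulted

    def write(self, data):
        """Write path: look up the checksum, share the block on a match."""
        key = hashlib.sha256(data).digest()
        self.lookups += 1
        entry = self.entries.get(key)
        if entry is not None:
            entry[1] += 1            # duplicate: bump the refcount only
            return (entry[0], key)
        addr = self.next_addr        # new data: allocate a fresh block
        self.next_addr += 1
        self.entries[key] = [addr, 1]
        return (addr, key)

    def free(self, block_pointer):
        """Free path: drop one reference; True means the block can go."""
        _, key = block_pointer
        self.lookups += 1
        entry = self.entries[key]
        entry[1] -= 1
        if entry[1] == 0:
            del self.entries[key]
            return True
        return False


ddt = ToyDedupTable()
bp1 = ddt.write(b"os image block")
bp2 = ddt.write(b"os image block")   # same contents, same address
bp3 = ddt.write(b"user data block")
print(bp1[0] == bp2[0], len(ddt.entries))
print(ddt.free(bp1), ddt.free(bp2))
```

Note that the toy has no read method at all: in this model a read would
go straight to the address in the block pointer, which is why the table
only matters for writes and frees.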

Thanks,
George

On 7/18/11 11:22 AM, Marcelo Leal wrote:
> Hello Garrett,
>   My question was not "exactly" this... actually, I do know about
> dedup's memory requirements.
>   My question is whether the copy held in memory is not "deduped".
>   Look at this comment on the post that i mentioned:
>
> -------------
> "Do I read "However if a subsequent read is for a duplicate block
> which happens to be in the pool ARC cache ... only a much faster copy
> of duplicate block will be necessary" correctly as "Data is not held
> deduped in the ARC cache"?
>
> I.e. using the S7000 to hold OS image in a VDI farm might dedup the OS
> images very effectively but would not be very effective in serving the
> OS images? The cache would have to be scaled linearly with the number
> of VMs running"?
> -------------
>
> And the answer from Roch was:
>
> ----------
> "Moritz, That's correct".
> ----------
>
>
>   Leal
> [ http://www.eall.com.br/blog ]
> -------=- pOSix rules -=-------
>
>
>
> 2011/7/18 Garrett D'Amore<garrett at nexenta.com>:
>> If you mean, does dedup still need a bunch of system memory *or* an SSD backed L2ARC (and a still large, although possibly smaller system memory size), then the answer is *yes*.  The DDT still wants to live in the ARC, and you really don't want to have to go to spinning rust to access it.
>>
>> There are many considerations when enabling dedup, and in general, you should be very careful before you enable it.  If you are going to only reduce your storage requirements by half or less, then I would not use it at all.  And I would never use it unless I had a boatload of memory to begin with.  (One possible exception would be if my entire data set was located on SSDs.  In that case, dedup even with a lower dedup ratio might still make sense, given the precious nature of bytes located on SSD.)
>>
>>   -- Garrett D'Amore
>>
>> On Jul 18, 2011, at 8:12 AM, "Marcelo Leal"<sunos.x86 at gmail.com>  wrote:
>>
>>> Hello there,
>>> I was reading this (old) post from Roch:
>>>
>>> http://blogs.oracle.com/roch/entry/dedup_performance_considerations1
>>>
>>> And looking at the comment, is that still true for the current
>>> implementation of dedup on ZFS?
>>> I mean, the performance gain from deduped blocks in memory could, I
>>> think, be worth more than the space savings on disk. ;-)
>>> So, I'm asking if that drawback persists in the updated code today.
>>>
>>> Thanks!
>>>
>>> Leal
>>> [ http://www.eall.com.br/blog ]
>>> -------=- pOSix rules -=-------
>>>
>>> _______________________________________________
>>> Developer mailing list
>>> Developer at lists.illumos.org
>>> http://lists.illumos.org/m/listinfo/developer



