[illumos-Discuss] Block pointer rewrite?
Haudy Kazemi
kaze0010 at umn.edu
Mon Feb 7 14:37:15 PST 2011
On 2/6/2011 8:16 PM, Garrett D'Amore wrote:
> On Sun, 2011-02-06 at 15:09 +0100, Roy Sigurd Karlsbakk wrote:
>> Hi all
>>
>> Is it possible, or likely, that OI/Illumos will ever get block pointer rewrite? I have a 50TB system that was almost filled up with its initial 30TB of storage before we added more drives. The problem is, those VDEVs are still full, and the system is _slow_, and I don't really want to make a backup/restore of those 35TB or so on it. What will it take to make VDEV balancing work on OI/Illumos?
> It's possible, and possibly even likely, that we will get this at some
> point. It's mostly a matter of time/investment, and priorities.
>
> - Garrett
I have a scriptable idea that offers a way to re-balance data on VDEVs
without using block pointer rewrite and without doing a full
backup/restore. I haven't tested this yet. Comments welcome.
Conditions/Prerequisites/Caveats:
1.) a second (temporary) storage pool that is at least large enough to
hold the single largest file in your collection, and preferably larger
so that files can be moved in batches.
2.) no requirement to keep old filesystem snapshots (i.e. can delete all
old snapshots)
3.) a period of time where the filesystem can effectively be unavailable
to applications and users (because a file they need might be temporarily
moved off)
4.) some free space must be available on each VDEV in the pool.
This is easiest if the original VDEVs were never completely full, and
any additional VDEVs have not been filled either. If the original VDEVs
were completely filled, then it is necessary to first apply this
procedure to any files that were on the original VDEVs, and then apply
it to any files written to the added VDEVs.
5.) pools built from devices of different sizes cannot be fully
balanced, e.g. a pool consisting of a mixed set of 500GB and 1TB drives.
The smaller drives will fill first, and then performance will decrease.
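To make prerequisite 1 concrete, here is a rough, untested-on-ZFS sketch of how files could be grouped into batches that fit the temporary pool. The function names scan_sizes and plan_batches are made up for illustration, and the capacity figure is an assumption you would supply yourself (for example from shutil.disk_usage() on the temporary pool's mountpoint, which may not match what ZFS will really accept once compression and reservations are taken into account).

```python
import os


def scan_sizes(root):
    """Return (path, size_in_bytes) for every regular file under root."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only real file data matters here.
            if os.path.isfile(path) and not os.path.islink(path):
                sizes.append((path, os.path.getsize(path)))
    return sizes


def plan_batches(sizes, capacity):
    """Group files into batches whose total size fits within capacity bytes.

    Files are taken largest first and a new batch is started whenever the
    next file would overflow the current one. This is a simple greedy
    grouping, not an optimal packing.
    """
    ordered = sorted(sizes, key=lambda item: item[1], reverse=True)
    if ordered and ordered[0][1] > capacity:
        # Prerequisite 1: the temporary pool must hold the largest file.
        raise ValueError("largest file does not fit: %s" % ordered[0][0])
    batches, current, used = [], [], 0
    for path, size in ordered:
        if current and used + size > capacity:
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += size
    if current:
        batches.append(current)
    return batches
```

Something like plan_batches(scan_sizes("/tank/data"), free_bytes) would then give the list of batches to move one at a time; the path /tank/data is only a placeholder.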
Steps:
1.) move a file or group of files to the temporary storage pool. If you know
which files were only on some of the VDEVs (i.e. the disks that got full
before additional capacity was added to the pool), move those files
first. Hopefully the newer vdevs have some free space on them. If
multiple sets of VDEVs have been added (and filled) before starting this
procedure, try to make sure some space has been freed up on all by
moving files away from each VDEV. File timestamps combined with the
dates upon which the VDEVs were added should help narrow down which
files are most likely on which VDEV.
2.) delete all old snapshots (they still reference the data blocks of
the files that were just moved, so that space is not freed until the
snapshots are gone). (Maybe also scrub?)
3.) move the file or group of files back to the main pool. ZFS should
stripe these rewritten files across the VDEVs that are not completely full.
4.) repeat from beginning if there were multiple sets of VDEVs that were
completely filled up after they were added to the pool. (repeat for
each set of files that were written to each set of now full VDEVs). Or
find another way to ensure there is some space freed up on each VDEV.
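As a starting point for scripting the steps above, here is a minimal sketch, not something I have run against a real pool. It picks files by modification time (the "timestamps vs. date the VDEVs were added" heuristic from step 1), moves them to the temporary location, calls an optional hook where the snapshot cleanup of step 2 would go, and moves them back. The names files_older_than and round_trip, and the between hook, are hypothetical. It assumes the temporary directory is on a different pool, so that shutil.move really copies the data instead of just renaming it.

```python
import os
import shutil


def files_older_than(root, cutoff):
    """Yield regular files under root last modified before cutoff.

    cutoff is a POSIX timestamp, e.g. the date a new VDEV was added.
    Modification time is only a guess at which VDEV holds the data.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if (os.path.isfile(path) and not os.path.islink(path)
                    and os.path.getmtime(path) < cutoff):
                yield path


def round_trip(paths, root, temp_dir, between=None):
    """Move paths (all under root) to temp_dir and back again.

    between, if given, is called once after everything has been moved
    out and before anything is moved back. That is where step 2
    (deleting old snapshots) would be run by whatever means you prefer.
    Returns the list of original paths that were rewritten.
    """
    staged = []
    for path in paths:
        # Keep the directory layout so equal file names do not collide.
        rel = os.path.relpath(path, root)
        dest = os.path.join(temp_dir, rel)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        shutil.move(path, dest)
        staged.append((dest, path))
    if between is not None:
        between()
    for dest, original in staged:
        shutil.move(dest, original)
    return [original for _dest, original in staged]
```

There is no checksum verification or crash recovery here; if the script dies half way, some files are left in the temporary pool. For real data it would be safer to copy, verify, and only then delete the original.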