[illumos-Advocates] RTI 175 zfs vdev cache consumes excessive memory

Garrett D'Amore garrett at nexenta.com
Thu Apr 21 21:37:35 PDT 2011


This is a trivial change, already reviewed on the ZFS list -- we
basically disable the vdev cache by setting zfs_vdev_cache_size to
zero.  (Diff below.)

garrett at thinkpad{8}>  hg outgoing -v
running ssh anonhg at hg.illumos.org "hg -R illumos-gate serve --stdio"
comparing with ssh://anonhg@hg.illumos.org/illumos-gate
searching for changes

changeset:   13343:5211e1f2192b
tag:         tip
user:        Garrett D'Amore <garrett at nexenta.com>
date:        Thu Apr 21 21:34:04 2011 -0700

description:
	175 zfs vdev cache consumes excessive memory
	Reviewed by: George Wilson <george.wilson at delphix.com>
	Reviewed by: Eric Schrock <eric.schrock at delphix.com>

modified:
   usr/src/uts/common/fs/zfs/vdev_cache.c


hg pbchk:

garrett at thinkpad{9}> hg pbchk
Copyright check:
usr/src/uts/common/fs/zfs/vdev_cache.c: no copyright claim for current year found

C style check:

Header format check:

Java style check:

Mapfile comment check:

File permission check:

Keywords check:

Comments check:

Checking for new tags:

Checking for multiple heads (or branches):

Checking for branch changes:

Checking for uncommitted changes:

Checking for merges:


Testing: We (Nexenta) have been running with this change for some time
now in some large deployments.  I've also verified it on my own
laptop.  Additionally, the tunable is zero on Solaris 11, so this
setting has been broadly exercised.

Note that the complaints in the nightly build (the manifest validation
entries below) are the result of my naive copying of the closed tree,
which broke hard links and symlinks, so some things were improperly
populated in the proto area.  I'm not going to rebuild, though, since
they don't relate to this particular change.

garrett at thinkpad{10}> cat log/log.2011-04-21.03:05/mail_msg

==== Nightly distributed build started:   Thu Apr 21 00:57:19 PDT 2011 ====
==== Nightly distributed build completed: Thu Apr 21 03:05:06 PDT 2011 ====

==== Total build time ====

real    2:07:46

==== Build environment ====

/usr/bin/uname
SunOS thinkpad 5.11 qlc2322 i86pc i386 i86pc

/opt/SUNWspro/bin/dmake
dmake: Sun Distributed Make 7.8 SunOS_i386 Patch 126504-01 2007/07/19
number of concurrent jobs = 10

32-bit compiler
/opt/onbld/bin/i386/cw -_cc
cw version 1.29
primary: /opt/SUNWspro/bin/cc
cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30
shadow: /usr/sfw/bin/gcc
gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802)

64-bit compiler
/opt/onbld/bin/i386/cw -_cc
cw version 1.29
primary: /opt/SUNWspro/bin/cc
cc: Sun C 5.9 SunOS_i386 Patch 124868-10 2009/04/30
shadow: /usr/sfw/bin/gcc
gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802)

/usr/java/bin/javac
java full version "1.6.0_21-b06"

/usr/ccs/bin/as
as: Sun Compiler Common 12 SunOS_i386 snv_121 08/03/2009

/usr/ccs/bin/ld
ld: Software Generation Utilities - Solaris Link Editors: 5.11-1.1726

Build project:  group.staff
Build taskid:   169

==== Nightly argument issues ====


==== Build version ====

zfs-vdev

==== Make clobber ERRORS ====


==== Make tools clobber ERRORS ====


==== Tools build errors ====


==== Build errors (DEBUG) ====


==== Build warnings (DEBUG) ====


==== Elapsed build time (DEBUG) ====

real  1:16:26.6
user  6:41:40.6
sys   1:21:38.0

==== Build noise differences (DEBUG) ====


==== package build errors (DEBUG) ====


==== Validating manifests against proto area ====

Entries present in proto area but not manifests:
	dir group=group mode=0755 owner=owner path=usr/lib/locale/POSIX/LC_COLLATE
	dir group=group mode=0755 owner=owner path=usr/lib/locale/POSIX/LC_CTYPE
	dir group=group mode=0755 owner=owner path=usr/lib/locale/POSIX/LC_MESSAGES
	dir group=group mode=0755 owner=owner path=usr/lib/locale/POSIX/LC_MONETARY
	dir group=group mode=0755 owner=owner path=usr/lib/locale/POSIX/LC_NUMERIC
	dir group=group mode=0755 owner=owner path=usr/lib/locale/POSIX/LC_TIME
	file usr/lib/locale/POSIX/locale_description group=group mode=0444 owner=owner path=usr/lib/locale/POSIX/locale_description

Entries that differ between manifests and proto area:
     manifests hardlink path=kernel/strmod/amd64/sdpib target=kernel/drv/amd64/sdpib
    proto area file kernel/strmod/amd64/sdpib group=group mode=0755 owner=owner path=kernel/strmod/amd64/sdpib
     manifests hardlink path=kernel/strmod/sdpib target=kernel/drv/sdpib
    proto area file kernel/strmod/sdpib group=group mode=0755 owner=owner path=kernel/strmod/sdpib
     manifests hardlink path=platform/i86pc/kernel/cpu/amd64/cpu_ms.GenuineIntel.6.47 target=platform/i86pc/kernel/cpu/amd64/cpu_ms.GenuineIntel.6.46
    proto area file platform/i86pc/kernel/cpu/amd64/cpu_ms.GenuineIntel.6.47 group=group mode=0755 owner=owner path=platform/i86pc/kernel/cpu/amd64/cpu_ms.GenuineIntel.6.47
     manifests hardlink path=platform/i86pc/kernel/cpu/cpu_ms.GenuineIntel.6.47 target=platform/i86pc/kernel/cpu/cpu_ms.GenuineIntel.6.46
    proto area file platform/i86pc/kernel/cpu/cpu_ms.GenuineIntel.6.47 group=group mode=0755 owner=owner path=platform/i86pc/kernel/cpu/cpu_ms.GenuineIntel.6.47
     manifests link path=usr/lib/fwflash/verify/ses-LSILOGIC.so target=ses-SUN.so
    proto area file usr/lib/fwflash/verify/ses-LSILOGIC.so group=group mode=0755 owner=owner path=usr/lib/fwflash/verify/ses-LSILOGIC.so
     manifests link path=usr/lib/fwflash/verify/sgen-LSILOGIC.so target=ses-SUN.so
    proto area file usr/lib/fwflash/verify/sgen-LSILOGIC.so group=group mode=0755 owner=owner path=usr/lib/fwflash/verify/sgen-LSILOGIC.so
     manifests link path=usr/lib/fwflash/verify/sgen-SUN.so target=ses-SUN.so
    proto area file usr/lib/fwflash/verify/sgen-SUN.so group=group mode=0755 owner=owner path=usr/lib/fwflash/verify/sgen-SUN.so
     manifests link path=usr/lib/locale/POSIX target=C
    proto area dir group=group mode=0755 owner=owner path=usr/lib/locale/POSIX


==== Check ELF runtime attributes ====


==== Diff ELF runtime attributes (since last build) ====


==== 'dmake lint' of src ERRORS ====


==== Elapsed time of 'dmake lint' of src ====

real    35:58.2
user  1:35:25.6
sys   1:25:37.6

==== lint warnings src ====


==== lint noise differences src ====


==== cstyle/hdrchk errors ====


==== Find core files ====


==== Check lists of files ====


==== Impact on file permissions ====



diff -r af0a1d7f121d -r 5211e1f2192b usr/src/uts/common/fs/zfs/vdev_cache.c
--- a/usr/src/uts/common/fs/zfs/vdev_cache.c	Wed Apr 20 19:50:50 2011 -0400
+++ b/usr/src/uts/common/fs/zfs/vdev_cache.c	Thu Apr 21 21:34:04 2011 -0700
@@ -71,9 +71,16 @@
  * 1<<zfs_vdev_cache_bshift byte reads by the vdev_cache (aka software
  * track buffer).  At most zfs_vdev_cache_size bytes will be kept in each
  * vdev's vdev_cache.
+ *
+ * TODO: Note that with the current ZFS code, it turns out that the
+ * vdev cache is not helpful, and in some cases actually harmful.  It
+ * is better if we disable this.  Once some time has passed, we should
+ * actually remove this to simplify the code.  For now we just disable
+ * it by setting the zfs_vdev_cache_size to zero.  Note that Solaris 11
+ * has made these same changes.
  */
 int zfs_vdev_cache_max = 1<<14;			/* 16KB */
-int zfs_vdev_cache_size = 10ULL << 20;		/* 10MB */
+int zfs_vdev_cache_size = 0;
 int zfs_vdev_cache_bshift = 16;
 
 #define	VCBS (1 << zfs_vdev_cache_bshift)	/* 64KB */
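
To put the old default in perspective: the comment in the diff says the
limit applies to each vdev's cache, so the worst case grows with the
number of vdevs in the system.  The following is only a rough sketch of
that arithmetic, not code from vdev_cache.c.  The constants are copied
from the diff; the helper names (vdev_cache_worst_case,
vdev_cache_entries) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the diff above. */
#define	OLD_VDEV_CACHE_SIZE	(10ULL << 20)	/* 10MB per vdev (before) */
#define	NEW_VDEV_CACHE_SIZE	0ULL		/* cache disabled (after) */
#define	VDEV_CACHE_BSHIFT	16		/* 64KB cache entries */

/*
 * Hypothetical helper: upper bound on the bytes held by vdev caches,
 * given that the size limit applies to each vdev separately.
 */
static uint64_t
vdev_cache_worst_case(uint64_t cache_size, unsigned int nvdevs)
{
	return (cache_size * nvdevs);
}

/*
 * Hypothetical helper: how many (1 << bshift)-byte entries fit in one
 * vdev's cache at the given size limit.
 */
static uint64_t
vdev_cache_entries(uint64_t cache_size, unsigned int bshift)
{
	assert(bshift < 64);
	return (cache_size >> bshift);
}
```

For example, on a hypothetical system with 48 vdevs,
vdev_cache_worst_case(OLD_VDEV_CACHE_SIZE, 48) comes to 480MB, while the
new default gives zero.  At the old default each vdev could hold up to
160 entries of 64KB.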




