[illumos-Developer] webrev: reimplementation of od

Garrett D'Amore garrett at damore.org
Mon Oct 18 21:00:33 PDT 2010


In order to answer some feedback questions, and address some bugs that
were pointed out in review, I've changed the implementation somewhat.

The new webrev is here: http://pastebin.ca/1966514

Thea diffs from the previous (yesterday's) version of the code are here:
http://pastebin.ca/1966521

The new logic should be simpler to read, since it just uses a circular
FIFO instead of three separate buffers.  It uses power of two math for
efficiency.

I've tested this as reasonably fully as I think I can.  I've tested
large files (over 4GB DVD images), and I've compared results with Sun's
od.  Its bit-for-bit perfectly matched by default with Sun's.

Most other formats are also bit for bit perfect.  One problem I did
discover was a bug in Sun's handling of -C output.  (Closed od sometimes
puts out "000" instead of "\0", which by definition is *always* wrong
for UTF8 since "000" can't never be part of a valid UTF-8 sequence other
than \0.)

Notably, this code runs in about half the time of xpg4 with the default
options.  So this is a nice performance win as well.

	- Garrett

Original xpg4 version:

garrett at master{26}> /bin/time od /data/isos/sol-nv-b105-x86-dvd.iso |
md5sum

real    15:17.3
user    14:56.3
sys        18.0
495ac766a109dc3d90330ea07aa0870e  -


My new version:

garrett at master{31}> /bin/time ~/od /data/isos/sol-nv-b105-x86-dvd.iso |
md5sum

real     7:17.3
user     6:58.9
sys        17.3
495ac766a109dc3d90330ea07aa0870e  -

Also, here's the actual file size:

garrett at master{32}> ls -la /data/isos/sol-nv-b105-x86-dvd.iso
-rw-r--r--   1 garrett  staff    3406561280 Jan 16
2009 /data/isos/sol-nv-b105-x86-dvd.iso

Note, that my "od" was built without optimization.  I expect to see a
further performance boost (about 10% probably) with optimization turned
on.

So I'll wait for further feedback, but unless someone can give me a
concrete example of where my implementation is deficient or buggy
(provide information, please -- "its broken" or "it doesn't conform to
specs" are useless comments without supporting information!), I'll be
moving forward with integration tomorrow.

	- Garrett

On Mon, 2010-10-18 at 10:25 -0700, Garrett D'Amore wrote:
> On Mon, 2010-10-18 at 18:42 +0200, Joerg Schilling wrote:
> > "Garrett D'Amore" <garrett at nexenta.com> wrote:
> > 
> > > So I couldn't find a suitable "od".  GNU is GPL and not POSIX compliant
> > > to boot.  (At least from my tests.)  And the legacy Solaris one is --
> > > startlingly, amazingly bad.  (If anyone can tell me what od -tfD
> > > *thinks* its doing when long doubles are 12-bytes wide, I'd love to hear
> > > an explanation.  I think its just generating random garbage somehow.)
> > 
> > There is my hdump(1) that could be used as a mature base framework.
> > 
> > > Anyway, the new code I wrote is here: 
> > >
> > > http://pastebin.ca/1964590
> > >
> > > Its not in an ON workspace yet.
> > >
> > > I've reasonably tested this as best as I can.  It performs on par with
> > > legacy Sun od, and for some interesting cases runs even faster.  (od -c
> > > in particular.)
> > ...
> > 
> > > Assuming I get some good reviews of this, I'll probably integrate it on
> > > Monday.  It compiles cleanly, so its simple to just d/l and compile if
> > > you want to try it yourself. :-)
> > 
> > -	Does not include largefile support
> 
> Wrong.  Its related to my next point.  I explicitly support large files
> directly, and do not require a compiler option to enable it. 
> 
> > -	Uses hidden intefaces not intended for the public
> 
> You must mean my of *64.  I'm not sure if they were intended for public
> consumption or not, but I had a specific reason in choosing this
> approach.  I wanted to be guaranteed that the code was 64-bit clean, and
> not dependent on compiler options, specifically my use of strtoll() in
> argument parsing was a concern.
> 
> At the end of the day, those interfaces are guaranteed to be safe,
> because they get baked into binaries compiled with -D_LARGEFILE_SOURCE,
> so I'm not worried about my explicit usage, and use of internal
> interfaces is always safe for a program targetting ON anyway.
> 
> However, if you're truly offended by this usage, I'll convert to use
> off_t and -D_LARGEFILE_SOURCE.
> 
> > -	Does not suppot the X-Open CLI
> 
> How so?  I implemented my version *to* the Open Group's specification.
> If there is a discrepancy here, I'd like to fix it.
> 
> > -	Does not give the same default output as od(1)
> 
> Whoops!  Easily (trivially) fixed.
> 
> > -	Occasionally prints incorrect output
> 
> Can you give me an example test case?  I wasn't able to find any
> breakages in my testing, so I'm very interested to see a busted example.
> 
> > -	Occacionally prints zeroes where non-zero data is known to be
> 
> Example test case please?  My tests did not show this, but I'll allow
> that there could be a latent bug here.
> 
> > -	Does not call mbtowc() call in a standard compliant way.
> 
> What are you talking about?  My calls to mbtowc are exactly as specified
> by the standard.
> 
> > 
> > -> not ready for integration.
> 
> I agree with your statement about the default output, which is a one
> line (really one *character*) change that I'll make right now.  The
> other issues you've raised need further explanation.
> 
> > How about enhancing hdump(1) with the od(1) interface?
> 
> I've written this code already, and IMO its further along.  I'm not
> interested in getting into a debate with you because you're interested
> in your program, and I'm not interested in starting over from scratch to
> with with a codebase with which I'm unfamiliar.  (Also, if your code
> base depends on libschily, that's another larger effort that I'd like to
> avoid.)
> 
> Of course, *you* could have picked this work up yourself.  But you
> didn't.  Apparently you're still irked and won't work on anything else
> until I approve star for integration.  That's *your* choice, not mine.
> 
> 	- Garrett
> 
> 
> _______________________________________________
> Developer mailing list
> Developer at lists.illumos.org
> http://lists.illumos.org/m/listinfo/developer





More information about the Developer mailing list