[illumos-Developer] tr webrev

Joerg Schilling Joerg.Schilling at fokus.fraunhofer.de
Sun Sep 5 16:04:17 PDT 2010


John Grafton <john.grafton at gmail.com> wrote:

> Cool, that settles it, then.  I'll work on merging the two and ask for help
> if the issues prove murkier than expected.

First a note to the webrev:

-        switch (argc) {
+        switch(argc) {

is a change that breaks basic "cstyle" compatibility. Could you please run 
cstyle on the source?

The main differences between *BSD indentation rules and the "Bill Joy 
normalform" is that logical operators at *BSD are at the beginning of 
continuation lines, so it is simple to pass cstyle if you start with a *BSD 
source.


Am I correllty asuming that the webrev shows differences between the current 
tr(1) in Illumos and your version?


Let me first repeat my comclusions from testing the FreeBSD tr after looking at 
the bug 132:

tr -c string1		does not work correctly
tr -C string1		does not work correctly

This unveils a problem: there has not been enough testing before integrating 
the tr(1) OSS replacement. Let me give answers to this problem later...


Let us put aside the problem with tr -C string1, as there is no current problem.

You tried to fix the problem with "tr -c string1" by going back to an older tr 
implementation that does not support multi byte chars at all. This seems to be 
the wrong method to work on our problem.

Please rather start again with the FreeBSD tr as this implementation supports 
multi byte characters already.

It seems to be not too hard to implement tr -c (the single byte char inverted 
list) as there are only 256 values to look at.

We need to find a solution for tr -C string1 later....

OK, back to testing.

We currently have no testing standards and if we don't work on this problem, 
this is a problem that will increase over time. An obvious way to work on this 
problem seems to be to create automated tests for all problems we did fix. This 
could at least prevent the same problems from being introduced later again.

In general, we should classify types of integrations and do testing in a way that 
fits the problem.

Let me list some possible classes:

problem not understood		We need to integrate stuff for a problem that
				is not fully understood


problem understood		We need to integrate stuff for a problem that
				is fully understood because the cause for the
				problem is well known and because the way to 
				fix is well known

big chunk of new software	This is a big chunk of new software that was
				not in Illumos before and that did not run
				on Solaris before

big chunk of native software	This is a big chunk of software that was not
				in Illumos before, but that did run on Solaris
				already

delta to existing software	This is a delta to software that did work 
				previously in Illumos without bugs or with
				smaller bugs

delta to integrating software	This is a delta to software that was previously
				integrated in Illumos but that did not work
				sufficiently

Let me make some examples:

The fix for bug 121 is "problem understood" + "delta to existing software"
This is an extremely simple case.

Integrating star is "problem understood" + "big chunk of native software"
This is also a simple case, as the software is known to work on Solaris and 
we only need to check the new Makefiles.

Bug 132 is "problem not understood" + "big chunk of new software"
This is a hard case that introduces instabilty.
The reason for the hard case is that tr depends on i18n in libc that
was recently replaced and that has not yet been tested sufficiently.
In addition, the FreeBSD tr did never run on Solaris before.

For the last case, we could in theory use the POSIX compatibility test suite, 
but this is someting I need to ask for at the OpenGroup and I have been told 
that we can only use it for free for one year only. For this reason, I decided 
to wait until we have the software to test available already on Illumos.

In general, we need to find ways to do better testing for the "problem not 
understood" + "big chunk of new software" case. 

If I did e.g. know that tr has not been tested before integration, I did try to 
test at least simple cases of all use cases descibed in the POSIX standard. So 
I did actally only test some simple cases with non US-ASCII chars as this is 
usually forgotten by US people. 

A bit more openness in the communication between Illumos developers would help 
a lot.

Jörg

-- 
 EMail:joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de                (uni)  
       joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
 URL:  http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily



More information about the Developer mailing list