[illumos-Developer] Webrev for bug 1155: in.ftpd sometimes starts with SIGALRM blocked

Garrett D'Amore garrett at nexenta.com
Thu Jul 21 12:34:10 PDT 2011


poll() doesn't affect (or shouldn't affect) the SIGALRM handling.  But usleep() is a famous system utility that *does* alter the signal mask.  Are there any instances of this?  The timer queues might do this as well.
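
One way to test a suspect like this is to compare the SIGALRM bit of the signal mask before and after the call. The following is only a minimal sketch under stated assumptions: the helper names (sigalrm_is_blocked, leaves_sigalrm_changed, probe_usleep, probe_leaky) are hypothetical and not taken from inetd or in.ftpd, and probe_leaky is a deliberately misbehaving stand-in for a routine that blocks SIGALRM and never restores it. It only catches a change that persists after the call returns, not a temporary one made inside the call.

```c
#define _DEFAULT_SOURCE /* glibc: expose usleep(); assumed harmless elsewhere */
#include <signal.h>
#include <unistd.h>

/* Returns 1 if SIGALRM is blocked in the caller, 0 if not, -1 on error. */
int sigalrm_is_blocked(void)
{
    sigset_t cur;

    /* A NULL "set" argument only queries the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, SIGALRM);
}

/*
 * Hypothetical helper: run fn() and report whether the blocked state
 * of SIGALRM differs afterwards.  Returns 1 if it changed, 0 if it
 * did not, -1 if the mask could not be read.
 */
int leaves_sigalrm_changed(void (*fn)(void))
{
    int before = sigalrm_is_blocked();
    fn();
    int after = sigalrm_is_blocked();

    if (before < 0 || after < 0)
        return -1;
    return before != after;
}

/* Candidate under test: a short usleep() call. */
void probe_usleep(void)
{
    (void) usleep(1000);
}

/* Deliberately leaky stand-in: blocks SIGALRM and never restores it. */
void probe_leaky(void)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGALRM);
    (void) sigprocmask(SIG_BLOCK, &set, NULL);
}
```

Calling leaves_sigalrm_changed(probe_usleep) on the system in question would show whether usleep() leaves the mask altered there; the same wrapper could be pointed at any other suspect routine.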

  -- Garrett D'Amore

On Jul 21, 2011, at 10:59 AM, "Gary Mills" <mills at cc.umanitoba.ca> wrote:

> On Thu, Jul 21, 2011 at 09:07:08AM -0700, Eric Schrock wrote:
>> 
>>   On Thu, Jul 21, 2011 at 9:00 AM, Gordon Ross
>>   <gordon.w.ross at gmail.com> wrote:
>> 
>>     Yes, but I'm nervous about giving an "OK" to a fix when
>>     there is not a clear root cause analysis in the issue.
>>     Is this a band-aid or a real fix?
> 
>>   Same here.  The fix looks fine but it makes me nervous to mask what
>>   may be a pervasive problem elsewhere in the system (i.e. some library
>>   function that is messing with signals in a way that could disrupt
>>   execution of other critical services).  How reproducible is this for
>>   you?  If it's remotely reproducible, we should cook up a D script that
>>   watches for any signal disposition changes in inetd and logs the stack
>>   somewhere.  I'm happy to work with you on that offline.
> 
> There are two aspects to this issue.  One is to initialize signal
> handling in the way that the new child process expects.  This includes
> signal handlers set to their defaults (done explicitly by inetd), no
> pending signals (done by fork()), and a clear signal mask (done by my
> addition).  It's not entirely a bandage.
> 
> The other aspect is the root cause, of course.  I discovered this
> problem on a moderately busy FTP server running under Solaris 10
> because in.ftpd processes would slowly accumulate.  I added some code
> to in.ftpd that wrote a log entry whenever it started up with SIGALRM
> blocked.  At times, it happened every five to twenty minutes of normal
> user activity.  I was never able to provoke it myself, but it did
> happen occasionally in production.  I don't know the conditions that
> result in this signal being blocked.
> 
> The inetd code does not include either SIGALRM or alarm().  It does,
> however, use a timeout in poll() and a timer queue.  To observe this
> problem, without knowing the cause in advance, you'd need an Illumos
> system with a variety of active services run by inetd.  I don't have
> such a system.
> 
> -- 
> -Gary Mills-        -Unix Group-        -Computer and Network Services-
> 
> _______________________________________________
> Developer mailing list
> Developer at lists.illumos.org
> http://lists.illumos.org/m/listinfo/developer
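
The two aspects Gary describes above, a child that starts with SIGALRM blocked and a launcher that clears the mask first, can be reproduced in miniature. This is a rough sketch, not the actual inetd or in.ftpd change: the function names (startup_sigalrm_blocked, spawn_and_check) are hypothetical, and the child exits with a status code where a real launcher would exec() the service and the service would write a log entry.

```c
#define _POSIX_C_SOURCE 200809L /* expose sigprocmask() etc. under strict modes */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if SIGALRM is blocked in the caller, 0 if not, -1 on error. */
int startup_sigalrm_blocked(void)
{
    sigset_t cur;

    /* A NULL "set" argument only queries the current mask. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, SIGALRM);
}

/*
 * Hypothetical demo: the parent blocks SIGALRM and forks; the child
 * reports through its exit status whether SIGALRM is still blocked.
 * If clear_mask is nonzero, the child installs an empty mask first,
 * as a launcher might do just before exec().
 * Returns 1 (blocked), 0 (not blocked), or -1 on error.
 */
int spawn_and_check(int clear_mask)
{
    sigset_t block, saved, empty;
    int status;
    pid_t pid;

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
        return -1;

    pid = fork();
    if (pid == 0) {
        if (clear_mask) {
            sigemptyset(&empty);
            (void) sigprocmask(SIG_SETMASK, &empty, NULL);
        }
        /* A real launcher would exec() here; the mask survives exec(). */
        _exit(startup_sigalrm_blocked() == 1 ? 1 : 0);
    }

    /* Restore the parent's original mask whether or not fork() worked. */
    (void) sigprocmask(SIG_SETMASK, &saved, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With clear_mask set to 0 the child inherits the blocked SIGALRM, which is the state in.ftpd was observed starting in; with clear_mask set to 1 the child starts with an empty mask regardless of what the parent had blocked.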
