[tools] lftp hard timeout

Lamont Granquist lamont at scriptkiddie.org
Mon Aug 27 21:12:24 CEST 2007



On Mon, 27 Aug 2007, Dag Wieers wrote:
> On Sun, 26 Aug 2007, Lamont Granquist wrote:
>> The net:timeout in lftp has not been working for me (RHEL5, lftp-3.5.1-2.fc6,
>> net:timeout set to 60, lftp session hanging for 3 days...) so i tried to wrap
>> lftp in a SIGALRM timeout.  The attached patch is for a hard coded 1 hr
>> timeout.  This is my first attempt at writing python code, so I could use some
>> review (other than the obvious hardcoding of 1 hr)
>
> I think this should be reported to the lftp mailing list, rather than worked
> around in mrepo. The patch in itself looks fine, but if it is a bug in
> lftp (or the ftp server) and the behaviour is different from what one would
> expect, mrepo is really the wrong location to fix it IMO.

Well, this isn't a "fix".  I consider it defensive programming, though.

My view is that if lftp has a bug that causes one instance of mrepo to 
hang forever I want notification of that, but I want mrepo to continue and 
hopefully tomorrow's run doesn't have this issue.  That keeps my overall 
architecture running smoothly and I can track down the lftp issue when I 
get the free time to do so.  The current behavior is annoying because I've 
got a completely broken architecture until I track down and kill the lftp 
bug...  Timing out operations and trying one's best to recover automatically 
is generally good practice (as long as there are no side effects like 
storms, but that shouldn't be an issue in this case...)

> Though the script looks OK from a functional POV. I don't think I would
> implement it any other way.

Doesn't seem to actually work, though.  I've got an lftp now that has been 
running for more than 12 hours that mrepo is blocking on.  The timeout 
also needs to forcefully kill the spawned lftp, since I've also got an 
orphaned lftp (ppid=1) which is spinning at 100% cpu.
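
For illustration, here is a minimal sketch of a hard timeout that also kills 
the child.  It is not the patch discussed above (which is not shown here) and 
it assumes a modern Python 3 on a POSIX system; the function name 
run_with_hard_timeout and its parameters are hypothetical.  Instead of 
SIGALRM it uses the timeout support in the subprocess module, and it starts 
the child in its own process group so that the child and anything it spawned 
can be sent SIGKILL together.

```python
import os
import signal
import subprocess
import sys


def run_with_hard_timeout(argv, timeout_s):
    """Run argv and return its exit status, or None if it was killed on timeout.

    Hypothetical helper for illustration; POSIX only (uses os.killpg).
    """
    # start_new_session=True makes the child the leader of a new session and
    # process group, so its process group id equals its pid.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # SIGKILL cannot be caught or ignored, so a child spinning at 100% cpu
        # is terminated too.  Signalling the group also reaches its children.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the child so it does not linger as a zombie
        return None


if __name__ == "__main__":
    # Stand-in for a hung lftp: a child that sleeps far longer than the timeout.
    hung = [sys.executable, "-c", "import time; time.sleep(30)"]
    status = run_with_hard_timeout(hung, timeout_s=1)
    if status is None:
        print("child timed out and was killed; continuing with the next job")
```

Because the function returns instead of raising, a caller can log the timeout 
and move on to the next repository, which is the "notify but keep going" 
behaviour described earlier in this message.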

