LWP timeout issue

on 14.03.2005 17:58:02 by xsheng

Hi all,

I have a quick question. I am using LWP to visit a list of web
pages. Some of them are unavailable because of a bad hostname, a
timeout, or an internal server error. How can I distinguish these cases? For
example, I want to know how many of my requests failed because the URI was
incorrect.

I also want to know how many of my requests failed because the server
was unavailable (timed out).

It seems to me that LWP simply returns code 500 for all of them.

Thanks,

Steve

Re: LWP timeout issue

on 15.03.2005 16:10:55 by christophe

Steve Sheng wrote:
> I also want to know how many of my request was failed because the
> server was unavailable (timed out).

About the timeout problems in LWP (perhaps a little off-topic?)

With unreachable hosts (e.g. a DNS problem), LWP timeouts don't work.

Here is all the relevant information I could gather, as I ran into
that problem too (a single LWP request took several minutes,
even with a 10-second timeout set). See the references and
excerpts below.

To summarize, here are the underlying problems, as I understand them:

1. alarm() and signal handling don't work (well?) on Win32
2. IO::Socket bug on Linux, which used alarm(): upgrade to IO >= 1.20
3. The LWP timeout starts only after the TCP connection is made (after any DNS timeouts)
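
One common workaround for problem 3 is to put a wall-clock deadline around the whole request with alarm(). The following is only a minimal sketch under assumptions: with_deadline is a hypothetical helper name, not part of LWP, and it depends on alarm(), so per problem 1 it is for Unix-like systems only. It may also fail to interrupt a lookup that is blocked inside the C resolver, because Perl 5.8 and later defer signal handlers until the current operation returns.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): run $code, but give up after
# $secs seconds of wall-clock time. Uses alarm(), so Unix-like only.
# Returns (1, results...) on completion, (0) if the deadline was hit.
sub with_deadline {
    my ($secs, $code) = @_;
    my @result;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "deadline exceeded\n" };
        alarm $secs;
        @result = $code->();
        alarm 0;
        1;
    };
    alarm 0;    # make sure no alarm stays pending if $code died
    return (1, @result) if $ok;
    die $@ unless $@ eq "deadline exceeded\n";    # re-throw unrelated errors
    return (0);
}

# Intended use with LWP (assumes $ua and $url are already set up):
#   my ($finished, $resp) = with_deadline(15, sub { $ua->get($url) });
#   print "gave up on $url\n" unless $finished;
```

The deadline here is independent of $ua->timeout, which can still be set to cover the read/write phase once the connection exists.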

Excerpts
--------

From [0]

$ua->timeout
$ua->timeout( $secs )
Get/set the timeout value in seconds.
The default timeout() value is 180 seconds, i.e. 3 minutes.

The requests is aborted if no activity on the connection
to the server is observed for "timeout" seconds.
This means that the time it takes for the complete
transaction and the request() method to actually return
might be longer.

From [1]

"This is a known bug in IO::Socket on Linux. If you upgrade your
IO-modules things should get better. IO-1.20 or better does not use
alarm to try to timeout connect(2)."

From [2]

"LWP relies on IO::Socket to do the connect timeout. Once it is
connected it will do its own timeout before each read/write."

From [3]

"It's an issue with DNS timeout in IO::Socket::INET, apparently."

From [4]

"The timeout may seem to NOT work, if the host you're trying to connect
is unreachable"

From [5]

"...if the host is unreachable, the timeout doesn't seems to work...
Or maybe the time out is only used once the TCP connections is done..."
"Under the covers, LWP uses SIGALRM (in IO::Socket) for timeouts.
Unfortunately, signals aren't supported on Win32."

From [6]

alarm SECONDS
alarm Not implemented. (Win32)

From [7]:

"Signal handling may not behave as on Unix platforms (where
it doesn't exactly "behave", either :). For instance,
calling "die()" or "exit()" from signal handlers will
cause an exception, since most implementations of "signal()"
on Win32 are severely crippled. Thus, signals may work
only for simple things like setting a flag variable
in the handler. Using signals under this port should
currently be considered unsupported."

References
----------

[0] LWP manual page
LWP::UserAgent(3pm)

[1] Re: LWP::UserAgent timeout / 29 Nov 1999
http://www.nntp.perl.org/group/perl.libwww/222

[2] Re: LWP and timeouts / 15 Oct 2003
http://www.nntp.perl.org/group/perl.libwww/5136

[3] Re: LWP and timeouts / 21 Oct 2003
http://www.nntp.perl.org/group/perl.libwww/5176

[4] LWP timeout / Apr 02, 2001
http://www.perlmonks.org/index.pl?node_id=68995

[5] LWP - timeout / Jan 26, 2001
http://www.perlmonks.org/index.pl?node_id=54528

[6] Perl 5.6.1 manual (5.8.4 differs regarding alarm)
perlport(1)

[7] Perl 5.6.1 manual (5.8.4 says the same about signals)
perlwin32(1)

Re: LWP timeout issue

on 15.03.2005 16:30:34 by christophe

Steve Sheng wrote:
> pages. Some of them are not available either due to bad hostname or time
> out, or internal server error. My question is how to distinguish them,

my $resp = $ua->get($url);
then check $resp->status_line. Example values:

500 Can't connect to www...com:80 (Bad hostname 'www...com')
500 Can't connect to ...com:80 (connect: timeout)
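
To count failures by cause, the status line can be matched against the phrases seen in the two examples above. This is a rough sketch only: classify_failure and its bucket names are hypothetical, and the patterns are taken from those two sample lines, so the exact wording may differ between LWP versions and platforms.

```perl
use strict;
use warnings;

# Hypothetical helper: put a response into a coarse bucket.
# $code is what $resp->code returns, $status_line what $resp->status_line returns.
sub classify_failure {
    my ($code, $status_line) = @_;
    return 'ok'           if $code >= 200 && $code < 300;
    return 'not_found'    if $code == 404;
    return 'client_error' if $code >= 400 && $code < 500;
    if ($code == 500) {
        # Message texts assumed from the sample status lines in this thread.
        return 'bad_hostname'   if $status_line =~ /Bad hostname/i;
        return 'timeout'        if $status_line =~ /timed? ?out/i;
        return 'connect_failed' if $status_line =~ /Can't connect/i;
    }
    return 'server_error' if $code >= 500 && $code < 600;
    return 'other';
}

# Tally the two sample status lines quoted above.
my %count;
for my $line (
    "500 Can't connect to www...com:80 (Bad hostname 'www...com')",
    "500 Can't connect to ...com:80 (connect: timeout)",
) {
    my ($code) = $line =~ /^(\d{3})/;
    $count{ classify_failure($code, $line) }++;
}
print "$_: $count{$_}\n" for sort keys %count;
```

In a real run the loop would call classify_failure($resp->code, $resp->status_line) for each response instead of parsing fixed strings.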

> for example I want to know how many of my requests failed because the
> uri was incorrect.

If you want to look for HTTP 404 errors:

my $code = $resp->code();
then check the value of $code, e.g. 400 <= $code && $code <= 499 (man HTTP::Response)
or if ($code == RC_NOT_FOUND)... (man HTTP::Status)
or if (is_client_error($code))...

> It seems to me that the LWP will simply return code 500 for all of them.

A 5xx HTTP code means 'Server error'. It seems LWP uses '500'
when it can't reach the host. But the whole status line is more verbose,
if you need more information.

Ch.