LWP ends up with unreadable characters

i'm trying to use LWP on:

http://education.yahoo.com/reference/dict_en_es/spanish/a_1;_ylt=AoFfUtrOQo3d1vl10ohvPPb2s8sF

when i do :

$ua = LWP::UserAgent->new;
$res1 = $ua->get($url,%header);
my $page = $res1->content;

'$page' ends up with unreadable characters. the code works fine for
most sites. also, if i fetch the page with 'lynx' i get readable stuff,
and a browser's 'view source' function on the page gets a normal
result.

ideas?

tom arnall
north spit, ca
tom arnall [ Mon, 24 April 2006 00:25 ] [ ID #1289330 ]

Re: LWP ends up with unreadable characters

kloro [at] cox.net writes:

> i'm trying to use LWP on:
>
> http://education.yahoo.com/reference/dict_en_es/spanish/a_1;_ylt=AoFfUtrOQo3d1vl10ohvPPb2s8sF
>
> when i do :
>
> $ua = LWP::UserAgent->new;
> $res1 = $ua->get($url,%header);
> my $page = $res1->content;
>
> '$page' ends up with unreadable characters. the code works fine for
> most sites. also, if i fetch the page with 'lynx' i get readable stuff,
> and a browser's 'view source' function on the page gets a normal
> result.
>
> ideas?

Try to provide a complete program that we can run to reproduce your
problem. I certainly get text out when I try to access your URL with
this program:

#!/usr/bin/perl -w

use strict;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
my $res = $ua->get('http://education.yahoo.com/reference/dict_en_es/spanish/a_1;_ylt=AoFfUtrOQo3d1vl10ohvPPb2s8sF');
my $page = $res->content;

print $page;
__END__

Perhaps you have something interesting in %header that you don't tell us about?

Regards,
Gisle
gisle [ Tue, 25 April 2006 16:17 ] [ ID #1289336 ]

Re: LWP ends up with unreadable characters

On Tuesday 25 April 2006 07:17 am, you wrote:
> kloro [at] cox.net writes:
> > i'm trying to use LWP on:
> >
> > http://education.yahoo.com/reference/dict_en_es/spanish/a_1;_ylt=AoFfUtrOQo3d1vl10ohvPPb2s8sF
> >
> > when i do :
> >
> > $ua = LWP::UserAgent->new;
> > $res1 = $ua->get($url,%header);
> > my $page = $res1->content;
> >
> > '$page' ends up with unreadable characters. the code works fine for
> > most sites. also, if i fetch the page with 'lynx' i get readable stuff,
> > and a browser's 'view source' function on the page gets a normal
> > result.
> >
> > ideas?
>
> Try to provide a complete program that we can run to reproduce your
> problem. I certainly get text out when I try to access your URL with
> this program:
>
> #!/usr/bin/perl -w
>
> use strict;
> use LWP::UserAgent;
>
> my $ua = LWP::UserAgent->new;
> my $res = $ua->get('http://education.yahoo.com/reference/dict_en_es/spanish/a_1;_ylt=AoFfUtrOQo3d1vl10ohvPPb2s8sF');
> my $page = $res->content;
>
> print $page;
> __END__
>
> Perhaps you have something interesting in %header that you don't tell us
> about?
>

thanks everyone for your responses. and indeed it has to do with the '%header'
statement, which runs:

my %header = (
    'Keep-Alive' => '300',
    'Connection' => 'keep-alive',
    'User-Agent' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050925 Firefox/1.0.4 (Debian package 1.0.4-2sarge5)',
    'Pragma' => 'no-cache',
    'Cache-control' => 'no-cache',
    'Accept' => 'image/png,*/*;q=0.5',
    'Accept-Encoding' => 'gzip,deflate',
    'Accept-Charset' => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
    'Accept-Language' => 'en-us,en;q=0.5',
    'Host' => $host,
    );

if " 'Accept-Encoding' => 'gzip,deflate' " is eliminated, the subsequent
fetch on the website is normal ascii.

tom arnall
north spit, ca
kloro [ Fri, 28 April 2006 06:02 ] [ ID #1294139 ]
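
A note on why dropping 'Accept-Encoding' helps: when a request advertises 'gzip,deflate', the server may send a compressed body, and $res->content returns those bytes as they arrived. The sketch below is only an illustration under stated assumptions: it builds a gzip-compressed HTTP::Response by hand (the sample HTML and the headers are made up, not taken from the Yahoo page) so it runs without network access, and it assumes an LWP recent enough to provide decoded_content.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Response;
use IO::Compress::Gzip qw(gzip $GzipError);

# Hypothetical page body, standing in for what the server would send.
my $html = "<html><body>hola</body></html>\n";

# Compress it the way a server might when it sees 'Accept-Encoding: gzip'.
gzip(\$html => \my $gzipped) or die "gzip failed: $GzipError";

# Hand-built response; a real one would come from $ua->get($url, %header).
my $res = HTTP::Response->new(200, 'OK');
$res->header('Content-Type'     => 'text/html; charset=ISO-8859-1');
$res->header('Content-Encoding' => 'gzip');
$res->content($gzipped);

my $raw  = $res->content;          # compressed bytes, unreadable if printed
my $page = $res->decoded_content;  # Content-Encoding undone, charset decoded

die "could not decode response body" unless defined $page;
print $page;
```

With a live request the same idea would be to keep the headers as they are and read $res1->decoded_content instead of $res1->content; removing 'Accept-Encoding' from %header, as described above, avoids the compression in the first place.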