Scripting problem when launching links using mech.

Hi All.

I have a script that goes to Google and performs a search. A regular
expression then collects all the links. The first link in the array it
returns is meant to open the next page, but it keeps failing with an error
stating that it is an illegal page. When I try to use LWP instead, it says
you cannot use an absolute link, but the link looks fine to me.

Below is the code; any help resolving this issue would be great. The goal
of the script is to take a movie name, find the details of the movie on
imdb.com, and save them as a text file. Eventually this information will
be turned into a web page.

So when I want to watch a movie at home, I can look up its rating, what it
is about, etc.

This is a work-in-progress script.

#!/usr/bin/perl

# Find the description and category for movies.

# Initialise modules.
use strict;
use warnings;
use URI;
use LWP::UserAgent;
use HTML::TreeBuilder 3; # make sure our version isn't ancient
use WWW::Mechanize;

my $dirname = 'T:\\new movies';
my @movies;
my $url = 'http://www.google.com.au';

open(my $fp, '<', 't:\\movies.txt') or die "Could not open t:\\movies.txt: $!";
my $m = WWW::Mechanize->new();
my $browser = LWP::UserAgent->new();
$m->get($url);

while (<$fp>) {
    chomp;
    #print "file: $_\n";

    $m->form_name('f');
    $m->field('q', 'imdb.com: ' . $_);
    $m->submit();

    my @links = $m->find_all_links(url_regex => qr{www\.imdb\.com/title/});
    next unless @links;    # no IMDb link in the results for this title

    # url_abs resolves a relative href against the page URL; LWP needs an
    # absolute URL.
    my $link = $links[0]->url_abs;
    my $text = $links[0]->text;
    print "$_\t$text\t$link\n";
    my $response = $browser->get($link);
    die "Hmm, error \"", $response->status_line(), "\" when getting $link"
        unless $response->is_success;
    my $content = $response->content();
}

Note: I have also tried $m->get($link) and even the $m->follow_link
method, with no success.

Sean
Sean Murphy [ Sat, 23 April 2011 01:59 ] [ ID #2058612 ]

Re: Scripting problem when launching links using mech.

Hi Sean,

On Saturday 23 Apr 2011 02:59:12 Sean Murphy wrote:
> Hi All.
>
> I have a script that goes to google. Performs a search.

Google does not allow you to do that using WWW-Mechanize and LWP-UserAgent
(legally). You should use their web-search API and:

http://search.cpan.org/dist/Google-Search/

> I have a regular
> expression that gets all the links.

You shouldn't parse HTML with regular expressions:

http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
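
As an illustration of the parser-based approach, here is a minimal sketch
using HTML::LinkExtor. The sample HTML, the base URL and the link targets
are made up for the example; they are not taken from the script above.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical sample page and base URL, for illustration only.
my $base = 'http://www.example.com/search';
my $html = <<'HTML';
<html><body>
<a href="/title/tt0000001/">First</a>
<a href="http://www.example.org/other">Other</a>
<img src="logo.png">
</body></html>
HTML

my @links;
my $parser = HTML::LinkExtor->new(
    sub {
        my ($tag, %attr) = @_;
        # The callback fires for every link-carrying tag (a, img, ...);
        # keep only anchors that have an href.
        push @links, "$attr{href}" if $tag eq 'a' && defined $attr{href};
    },
    $base,    # with a base URL, hrefs are returned as absolute URIs
);
$parser->parse($html);
$parser->eof;

print "$_\n" for @links;
```

Because a base URL is passed to the constructor, the relative href comes
back already resolved, so the links can be handed to LWP without further
processing.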

> The first link in the array I return
> is meant to open the next page. But it keeps failing with an error stating
> it is illegal page. When I try and use lwp, it says you cannot use an
> absolute link. But the link looks fine to me.
>
> Below is the code, any help on resolving this issue would be great. The
> goal of the script is to input a movie name. find the details of the movie
> from imdb.com. Then save it as a text file. Eventually this Information
> will be turned into a web page.
>

Please re-implement the code with the Google Search API, and peace will come
upon the land.
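
On the LWP error itself, one possible cause (an assumption, since the exact
error text is not shown) is that the href taken from the result page is
relative, while LWP::UserAgent only fetches absolute URLs. A minimal sketch
of resolving such a link with the URI module follows; the base URL and the
href are hypothetical values, not output from the script.

```perl
use strict;
use warnings;
use URI;

# Hypothetical values: the page the link was found on, and the href as it
# might appear in that page's HTML.
my $base = 'http://www.google.com.au/search?q=test';
my $href = '/url?q=http://www.imdb.com/title/tt0000001/';

# Resolve the relative href against the page it came from.
my $abs = URI->new_abs($href, $base);
print "$abs\n";

# query_form returns the decoded key/value pairs of the query string,
# which recovers the wrapped target URL in this made-up example.
my %param = $abs->query_form;
print "$param{q}\n";
```

With WWW::Mechanize, the url_abs method of a link object performs the same
resolution against the current page.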

Regards,

Shlomi Fish

--
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
"Star Trek: We, the Living Dead" - http://shlom.in/st-wtld

Chuck Norris is the ghost author of the entire Debian GNU/Linux distribution.
And he wrote it in 24 hours, while taking snack breaks.

Please reply to list if it's a mailing list post - http://shlom.in/reply .

--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
Shlomi Fish [ Sat, 23 April 2011 08:54 ] [ ID #2058622 ]

Re: Scripting problem when launching links using mech.

Hi Shlomi,

Thanks, I will check that out. Peace on the land is the best of course. :-)

Sean
Sean Murphy [ Sat, 23 April 2011 10:03 ] [ ID #2058624 ]