How to handle redirects using the Scrappy module

I am trying to crawl a web page that redirects to another page.
I am using the Scrappy module for crawling.
I am using version 0.94111370 (the latest version).
Can anyone suggest how to handle the Redirect?

thank you,
Muthukumaraswamy.C (Ambuli)


--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
muthukumar swamy [ Tue, 24 May 2011 15:25 ] [ ID #2060009 ]

Re: How to handle redirects using the Scrappy module

On Tue, May 24, 2011 at 06:25:53 -0700 , Ambuli wrote:
> I am trying to crawl a web page that redirects to another page.
> I am using the Scrappy module for crawling.
> I am using version 0.94111370 (the latest version).
> Can anyone suggest how to handle the Redirect?

What do you mean by 'handle the Redirect'? Your message isn't clear.

--
Chris Nehren | Coder, Sysadmin, Masochist
Shadowcat Systems Ltd. | http://shadowcat.co.uk/

Chris Nehren [ Tue, 31 May 2011 11:14 ] [ ID #2060308 ]

Re: How to handle redirects using the Scrappy module

Hi Chris Nehren,
Here is my code, to make clear what I mean.

my $scraper = Scrappy->new;
my $new_url = "Some Url";
$scraper->get($new_url);
if ($scraper->page_status == 302)
{
    # Here I want to get the redirect Location
}

Please give me some suggestions.


muthukumar swamy [ Tue, 31 May 2011 16:02 ] [ ID #2060315 ]

Re: How to handle redirects using the Scrappy module

On Tue, May 24, 2011 at 06:25:53 -0700 , Ambuli wrote:
> I am trying to crawl a web page that redirects to another page.
> I am using the Scrappy module for crawling.
> I am using version 0.94111370 (the latest version).
> Can anyone suggest how to handle the Redirect?

What do you mean by "handle the Redirect"? I'm afraid your question
isn't clear.

--
Chris Nehren | Coder, Sysadmin, Masochist
Shadowcat Systems Ltd. | http://shadowcat.co.uk/

Chris Nehren [ Tue, 31 May 2011 11:05 ] [ ID #2060351 ]

Re: How to handle redirects using the Scrappy module

On Tue, May 31, 2011 at 05:05, Chris Nehren
<c.nehren/beginners [at] shadowcat.co.uk> wrote:
> On Tue, May 24, 2011 at 06:25:53 -0700 , Ambuli wrote:
>> I am trying to crawl a web page that redirects to another page.
>> I am using the Scrappy module for crawling.
>> I am using version 0.94111370 (the latest version).
>> Can anyone suggest how to handle the Redirect?
>
> What do you mean by "handle the Redirect"? I'm afraid your question
> isn't clear.
>

I'm assuming that the OP wants to know whether the web request was
redirected via a 301 or a 302...

It looks like Scrappy handles such redirects transparently, but
provides the 'request_denied' method as a flag that can be checked.
Here's some sample code that uses a page on one of my domains that
gives a 301:


--cut--
#! /opt/perl/bin/perl

use strict;
use warnings;
use 5.010;

use Scrappy;

my $s = Scrappy->new;
$s->get( 'http://genehack.org/about' );

say "Status: ",$s->page_status;
say "Denied: ",$s->request_denied;

my @redirects = $s->response->redirects;
say "Original URL: ", $redirects[0]->request->url;
say "Fetched URL: ",$s->response->request->url;
--cut--

Running this produces:

$ ./try.pl
Status: 200
Denied: 1
Original URL: http://genehack.org/about
Fetched URL: http://genehack.net/about/

As you can see, the status code is reported as a 200, even though
the request was redirected along the way.

The 'response' method on the Scrappy object returns an HTTP::Response
object. You should read the documentation for that module to
understand what the last several lines in my script are doing. You'll
need to understand that in order to be able to reliably detect
redirects yourself.
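
To make the HTTP::Response part concrete without touching the network,
here is a minimal sketch that builds a 301 -> 200 chain by hand and reads
the Location header from each earlier response. The example.com and
example.net URLs are placeholders and 'redirect_locations' is a
hypothetical helper name, not something Scrappy provides. I'm assuming
the object you get back from Scrappy's 'response' method can be passed
to it in the same way.

```perl
use strict;
use warnings;

use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: return the Location header of every redirect
# that led to $response, oldest first. Empty list means no redirect.
sub redirect_locations {
    my ($response) = @_;
    return map { scalar $_->header('Location') } $response->redirects;
}

# Hand-built stand-in for the first answer (a 301 pointing elsewhere).
my $first = HTTP::Response->new(
    301, 'Moved Permanently',
    [ Location => 'http://example.net/about/' ],
);
$first->request( HTTP::Request->new( GET => 'http://example.com/about' ) );

# Hand-built stand-in for the final answer; 'previous' links the chain.
my $final = HTTP::Response->new( 200, 'OK' );
$final->request( HTTP::Request->new( GET => 'http://example.net/about/' ) );
$final->previous($first);

my @locations = redirect_locations($final);
print "Redirected: ", ( @locations ? "yes" : "no" ), "\n";
print "Location: $_\n" for @locations;
print "Final URL: ", $final->request->uri, "\n";
```

The point is that the final response only ever shows the last status
code, so the redirect information has to come from walking back through
the earlier responses.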

chrs,
john.

John SJ Anderson [ Wed, 01 June 2011 03:41 ] [ ID #2060352 ]