Dereference Links
I'm trying to dereference the @{$links} produced
by WWW::SimpleRobot and am having a heck of
a time getting it done. Can anybody help?
You can see some of the things I have tried
below.
I know I can do this link extraction myself with
LinkExtor, or at least think I can do it, but
I'd like to know how to dereference this script.
Mike Flannigan
#
#
#
#!/usr/local/bin/perl
#
use strict;
use warnings;
use WWW::SimpleRobot;
my $robot = WWW::SimpleRobot->new(
    URLS => [ 'http://www.portofhouston.com/' ],
    FOLLOW_REGEX => "^http://www.portofhouston.com//",
    DEPTH => 1,
    TRAVERSAL => 'depth',
    VISIT_CALLBACK =>
        sub {
            my ( $url, $depth, $html, $links ) = @_;
            my @linkder = @{$links};
            print STDERR "Visiting $url\n\n";
            # print STDERR "Depth = $depth\n";
            # print STDERR "HTML = $html\n";
            # print STDERR "Links = @{$links}\n";
            # print STDERR "Links = @linkder\n";
            # foreach (@linkder){
            #     print STDERR "$_\n";
            # }
            for (my $num = 0; $num <= $#linkder; $num++) {
                print STDERR "$linkder[$num]\n";
            }
            # for (my $num = 0; $num <= $#linkder; $num++) {
            #     print STDERR "${$linkder}[$num]\n";
            # }
        }
    ,
    BROKEN_LINK_CALLBACK =>
        sub {
            my ( $url, $linked_from, $depth ) = @_;
            print STDERR "$url looks like a broken link on $linked_from\n";
            print STDERR "Depth = $depth\n";
        }
);
$robot->traverse;
my @urls = @{$robot->urls};
my @pages = @{$robot->pages};
for my $page ( @pages )
{
    my $url = $page->{url};
    my $depth = $page->{depth};
    my $modification_time = $page->{modification_time};
}
print "\nAll done.\n";
__END__
--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
Re: Dereference Links
On Fri, 21 Jan 2011 08:01:03 -0600, Mike Flannigan wrote:
> I'm trying to dereference the @{$links} produced by WWW::SimpleRobot and
> am having a heck of a time getting it done. Can anybody help? You can
> see some of the things I have tried below.
That module hasn't been updated since 2001. You'll have a much easier
time using WWW::Mechanize and many more people will be in a position to
help you.
--
Peter Scott
http://www.perlmedic.com/ http://www.perldebugged.com/
http://www.informit.com/store/product.aspx?isbn=0137001274
http://www.oreillyschool.com/courses/perl3/
Re: Dereference Links
You also might want to look into Data::Dumper to see exactly what you're
working with.
Off the top though:
map { @$_ } @$arr
might be a start.
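
As a rough sketch of both suggestions, the snippet below dumps a nested structure with Data::Dumper and then flattens it with map. The contents of $arr are made-up sample data (an assumption, not output from WWW::SimpleRobot); only the shape, an array reference holding array references, matters here.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data: a reference to an array of array references.
my $arr = [
    [ 'a',   'href', 'http://www.example.com/' ],
    [ 'img', 'src',  'http://www.example.com/logo.png' ],
];

# Show the full nesting so it is clear what needs dereferencing.
print Dumper($arr);

# Dereference the outer array, then each inner array, into one flat list.
my @flat = map { @$_ } @$arr;
print scalar(@flat), " items: @flat\n";
```

The Dumper output makes the two levels of nesting visible, which is usually enough to tell how many dereferences are needed.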
Re: Dereference Links
On 21/01/2011 14:01, Mike Flannigan wrote:
>
> I'm trying to dereference the @{$links} produced
> by WWW::SimpleRobot and am having a heck of
> a time getting it done. Can anybody help?
> You can see some of the things I have tried
> below.
Hey Mike
What you have written can be fixed by changing it to

    for (my $num = 0; $num <= $#linkder; $num++) {
        print STDERR "@{$linkder[$num]}\n";
    }

or even

    for (my $num = 0; $num <= $#{$links}; $num++) {
        print STDERR "@{$links->[$num]}\n";
    }

but it is much clearer and more Perlish to write

    foreach my $link (@$links) {
        print STDERR "@{$link}\n";
    }

Remember: everywhere you could put a simple variable identifier you can
put a reference. Surrounding it in braces is always valid and helps
resolve ambiguity, so @linkder is the same as @{linkder} is the same as
@{$links}. Likewise, $linkder[$num] (or $links->[$num]) is an array
reference, and can be dereferenced with @{$linkder[$num]}.
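
To make the two levels concrete, here is a minimal sketch that runs without the robot. The $links data and the link_lines helper are hypothetical stand-ins invented for illustration; the only assumption carried over from above is that each element of @{$links} is itself an array reference.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the callback's $links argument.
my $links = [
    [ 'a',   'href', 'http://www.example.com/about.html' ],
    [ 'img', 'src',  'http://www.example.com/logo.png' ],
];

# Return one printable line per link by dereferencing each inner array.
sub link_lines {
    my ($links) = @_;
    my @lines;
    foreach my $link ( @{$links} ) {        # outer reference -> list of inner references
        push @lines, join ' ', @{$link};    # inner reference -> its elements
    }
    return @lines;
}

# Without the inner dereference each element stringifies as ARRAY(0x...).
print STDERR "not dereferenced: $links->[0]\n";
print STDERR "dereferenced: $_\n" for link_lines($links);
```

The first print shows what the original loop produces: a stringified reference rather than the link data it points to.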
HTH,
Rob
Re: Dereference Links
On 1/23/2011 5:21 PM, beginners-digest-help [at] perl.org wrote:
> That module hasn't been updated since 2001. You'll have a much easier
> time using WWW::Mechanize and many more people will be in a position to
> help you.
>
> Peter Scott
>
Thank you for the reply.
I appreciate it.
Mike Flannigan
Re: Dereference Links
On 1/25/2011 6:07 PM, Rob and Shawn wrote:
> Hey Mike
>
> What you have written can be fixed by changing it to
>
>     for (my $num = 0; $num <= $#linkder; $num++) {
>         print STDERR "@{$linkder[$num]}\n";
>     }
>
> or even
>
>     for (my $num = 0; $num <= $#{$links}; $num++) {
>         print STDERR "@{$links->[$num]}\n";
>     }
>
> but it is much clearer and more Perlish to write
>
>     foreach my $link (@$links) {
>         print STDERR "@{$link}\n";
>     }
>
> Remember: everywhere you could put a simple variable identifier you can
> put a reference. Surrounding it in braces is always valid and helps
> resolve ambiguity, so @linkder is the same as @{linkder} is the same as
> @{$links}. Likewise, $linkder[$num] (or $links->[$num]) is an array
> reference, and can be dereferenced with @{$linkder[$num]}.
>
> HTH,
>
> Rob

> You also might want to look into Data::Dumper to see exactly what you're
> working with.
> Off the top though:
> map { @$_ } @$arr
> might be a start.
_____________________________________________
Thank you Rob and Shawn. It worked well.
I can't claim I know fully why, but it worked.
I sure hope Perl6 gets rid of dereferencing.
Mike Flannigan