Parsing HTML with Regular Expressions

I am trying to pull out an href from a bit of JavaScript. I am running
PHP, but the RE should be the same.

What I have is this:

<a href="JavaScript:void(0)"
onClick="JavaScript:window.open(\'http://www.seiner.com/blog/Travels/images/wp-snapshot.php?image=http://www.seiner.com/blog/Travels/images/2.jpg&width=730&height=755\',
\'FamilyPic\', \'scrollbars=yes,height=755,width=730,location=no\');
return false"><img
src="http://www.seiner.com/blog/Travels/images/thumb-2.jpg"/></a>

What I want to do is pull out the URL in the window.open call, but only
if it doesn't contain either a next=[whatever] or a prev=[whatever] parameter.

In other words, the above URL doesn't contain either one, so my RE
returns 'http://www.seiner.com/blog/Travels/images/2.jpg'.

But if the above URL were to be as follows (see the next and prev at the
end of the URL):

<a href="JavaScript:void(0)"
onClick="JavaScript:window.open(\'http://www.seiner.com/blog/Travels/images/wp-snapshot.php?image=http://www.seiner.com/blog/Travels/images/2.jpg&width=730&height=755&prev=4.jpg&next=2.jpg\',
\'FamilyPic\', \'scrollbars=yes,height=755,width=730,location=no\');
return false"><img
src="http://www.seiner.com/blog/Travels/images/thumb-2.jpg"/></a>

I want the RE to not match....

The RE I am using is

$re = '<[aA] .*image=([a-zA-Z0-9.:/-]*).*/>';

and the actual match is done via:

preg_match_all($re, $text, $matches, PREG_OFFSET_CAPTURE);

TIA...
Captain Dondo [ Wed, 15 June 2005 17:39 ] [ ID #839192 ]
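
One possible way to get the "match only when next= and prev= are absent" behaviour is a negative lookahead that scans the first quoted argument of window.open before the image= value is captured. The following is only a sketch under some assumptions: the function name extractImageUrls and the variables $withoutNav and $withNav are made up for illustration, the quotes around the URL may or may not be preceded by a backslash, and parameters are assumed to be separated by ?, & or ; (the last one covers &amp;). Note also that, as far as I can tell, PHP treats the leading < and trailing > of the original $re as pattern delimiters rather than as literal characters, so the sketch uses explicit # delimiters instead.

```php
<?php
// Hypothetical helper: returns the image= URL of every window.open(...) call
// whose first quoted argument contains neither next= nor prev=.
function extractImageUrls(string $text): array
{
    $re = '#window\.open\(\\\\?\''           // window.open( plus ' or \'
        . '(?![^\']*[?&;](?:next|prev)=)'     // reject if next=/prev= occurs before the closing quote
        . '[^\']*?[?&;]image='                // skip ahead to the image= parameter
        . '([a-zA-Z0-9.:/-]+)#i';             // capture the URL (same character class as the original RE)

    preg_match_all($re, $text, $matches);

    return $matches[1];                       // group 1 captures; empty array if nothing matched
}

// Sample input without next=/prev= (assumed to match).
$withoutNav = <<<'HTML'
<a href="JavaScript:void(0)" onClick="JavaScript:window.open('http://www.seiner.com/blog/Travels/images/wp-snapshot.php?image=http://www.seiner.com/blog/Travels/images/2.jpg&width=730&height=755', 'FamilyPic', 'scrollbars=yes,height=755,width=730,location=no'); return false"><img src="http://www.seiner.com/blog/Travels/images/thumb-2.jpg"/></a>
HTML;

// Sample input with prev= and next= (assumed not to match).
$withNav = <<<'HTML'
<a href="JavaScript:void(0)" onClick="JavaScript:window.open('http://www.seiner.com/blog/Travels/images/wp-snapshot.php?image=http://www.seiner.com/blog/Travels/images/2.jpg&width=730&height=755&prev=4.jpg&next=2.jpg', 'FamilyPic', 'scrollbars=yes,height=755,width=730,location=no'); return false"><img src="http://www.seiner.com/blog/Travels/images/thumb-2.jpg"/></a>
HTML;

print_r(extractImageUrls($withoutNav . "\n" . $withNav));
```

Because the lookahead and the skip both use [^']* rather than .*, each check stays inside a single window.open argument, so a link with prev= later in the same text should not suppress an earlier link without it. If the offsets are needed, PREG_OFFSET_CAPTURE can be passed as in the original call, in which case each entry of $matches[1] becomes a [string, offset] pair.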