RE Perl Pattern matching
Hi,
I have a string, say $str, whose value is shown below:
<responseStatus>HTTP/1.1 200 OK</responseStatus>
<cookies>
<cookie name="ASPSESSIONIDSQDCBDBA" path="/" domain="www-int.juniper.net">DOCFGJEAKNOMBLHCGEMOIMBA</cookie>
</cookies>
<headers>
<header name="Cache-control">private</header>
<header name="Content-Encoding">deflate</header>
<header name="Content-Type">text/html</header>
<header name="Date">Wed, 26 Mar 2008 04:48:16 GMT</header>
<header name="Server">Concealed by Juniper Networks Redline EX</header>
<header name="Set-Cookie">ASPSESSIONIDSQDCBDBA=DOCFGJEAKNOMBLHCGEMOIMBA; path=/</header>
<header name="Transfer-Encoding">chunked</header>
<header name="Vary">Accept-Encoding, User-Agent</header>
<header name="Via">1.1 sac-p-green-dx2 (Juniper Networks Application Acceleration Platform - DX 5.1.8 0)</header>
<header name="Warning">214 www-int.juniper.net "Juniper Networks DX Active"</header>
<header name="X-Powered-By">ASP.NET</header>
</headers>
<content>
<contentLength>27887</contentLength>
<compression>71.3</compression>
<encodingScheme>deflate</encodingScheme>
<text><![CDATA[
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"..."http://
www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">..<html>..<head>....<title>
Intranet Home Page</title>..<script language="JavaScript" type="text/
javascript">..function clicker()..{..document.seek2.qt.value =
document.seek1.qt.value;..return true;..}</form>.. <!-- close Main2 --
>..</div><!-- close Main1 -->....</body>..</html>..
]]></text>
<mimeType>text/html</mimeType>
</content>
----------------
Now I want to get everything between "<text><![CDATA[" and "]]></text>"
(i.e. I need to capture the CDATA section), and I am using the code
below:
if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
{
print $1;
}
But I am not getting anything. Can anyone find the fault in it?
Re: RE Perl Pattern matching
On Apr 2, 2:23 pm, Deepan Perl XML Parser <deepan... [at] gmail.com> wrote:
> if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
> {
> print $1;
>
> }
>
> But not getting anything. Can anyone find out the fault in it?
You need an "s" modifier at the end, so that "." also matches newlines:
if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#s )
See http://perldoc.perl.org/perlre.html#Modifiers
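To see the difference in isolation, here is a minimal sketch. The string is a made-up, shortened stand-in for the poster's $str, and the variable names $without_s and $cdata are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the $str in the original post.
my $str = "<text><![CDATA[\n<html>\n<body>hello</body>\n</html>\n]]></text>";

# Without /s, "." does not match a newline, so (.*) cannot cross the
# line breaks to reach the closing "]]></text>" and the match fails.
my $without_s = $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># ? 1 : 0;

# With /s, "." also matches newlines, so the capture spans every line.
my ($cdata) = $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#s;

print "without /s: ", ($without_s     ? "matched" : "no match"), "\n";
print "with /s:    ", (defined $cdata ? "matched" : "no match"), "\n";
```

Only the trailing "s" differs between the two patterns.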
Re: RE Perl Pattern matching
On Apr 2, 10:30 am, Ben Bullock <benkasminbull... [at] gmail.com> wrote:
> On Apr 2, 2:23 pm, Deepan Perl XML Parser <deepan... [at] gmail.com> wrote:
>
> > if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
> > {
> > print $1;
>
> > }
>
> > But not getting anything. Can anyone find out the fault in it?
>
> You need an "s" at the end:
>
> if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text>#s )
>
> See http://perldoc.perl.org/perlre.html#Modifiers
Thank You Ben!
Re: Perl Pattern matching
Deepan Perl XML Parser wrote:
> Now i want to get everything between "<text><![CDATA[" and "]]></
> text>" [ie i need to capture the CDATA section]and i am using the
> below code
>
> if( $str =~ m#<text><!\[CDATA\[(.*)\]\]></text># )
> {
> print $1;
> }
Your expression is perfectly valid (apart from the missing /s
modifier), but I'd like to add a remark. You could also strip
the newline characters (if any) and extract more than one
CDATA section, something like:
my $reg = qr{
    <text>                 # find the <text> element
    <!\[CDATA\[ [\r\n]?    # CDATA opener, then an optional newline
    (.+?)                  # capture the CDATA lines, non-greedily
    [\r\n]?\]\]>           # optional newline, then the CDATA terminator
    </text>                # require the closing </text> tag as well
}sx;
print $1 while $str =~ /$reg/g;   # extract each CDATA section
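The non-greedy (.+?) is what keeps several sections apart: a greedy (.+) would run from the first opener to the last terminator. A small self-contained sketch of that, using invented two-section input and \s* in place of [\r\n]? (my substitution, so that a "\r\n" pair is stripped too):

```perl
use strict;
use warnings;

# Hypothetical input with two <text> elements (not from the original post).
my $str = join "\n",
    '<text><![CDATA[', 'first',  ']]></text>',
    '<text><![CDATA[', 'second', ']]></text>';

my $reg = qr{
    <text>
    <!\[CDATA\[ \s*    # opening marker, then skip leading whitespace
    (.+?)              # non-greedy: stop at the nearest terminator
    \s* \]\]>          # skip trailing whitespace, then the terminator
    </text>
}sx;

# In list context, a /g match returns the captures of every match.
my @sections = $str =~ /$reg/g;
print "$_\n" for @sections;
```

Collecting the captures into an array rather than printing $1 in a loop makes the result easier to reuse later.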
Regards
M.
Re: RE Perl Pattern matching
On 2008-04-02, Deepan Perl XML Parser <deepan.17 [at] gmail.com> wrote:
> Hi,
<much XML snipped>
>
>
> But not getting anything. Can anyone find out the fault in it?
You're trying to parse XML with regular expressions. Don't do that.
Perl has a large selection of excellent modules for processing XML.
Use them.
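As one possible illustration, here is a sketch using XML::LibXML from CPAN (not a core module, so it has to be installed). The <response> root element is an assumption: the fragment posted has several top-level elements, and a parser needs a single root.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical well-formed wrapper; the <response> root is an assumption.
my $xml = <<'XML';
<response>
  <content>
    <text><![CDATA[
<html><body>hello</body></html>
]]></text>
    <mimeType>text/html</mimeType>
  </content>
</response>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# findvalue returns the string value of the matched <text> element. The
# parser removes the CDATA markers, so only the enclosed text comes back.
my $cdata = $doc->findvalue('/response/content/text');
print $cdata;
```

The parser takes care of line breaks and CDATA delimiters, so no /s modifier or escaping of brackets is involved.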
--
Christopher Mattern
NOTICE
Thank you for noticing this new notice
Your noticing it has been noted
And will be reported to the authorities
Re: RE Perl Pattern matching
On Wed, 02 Apr 2008 10:53:34 -0500, Chris Mattern wrote:
> You're trying to parse XML with regular expressions. Don't do that.
> Perl has a large selection of excellent modules for processing XML. Use
> them.
Chris, do you talk like that to people in real life, or is it just the
internet?
Re: RE Perl Pattern matching
>>>>> "BB" == Ben Bullock <benkasminbullock [at] gmail.com> writes:
BB> On Wed, 02 Apr 2008 10:53:34 -0500, Chris Mattern wrote:
>> You're trying to parse XML with regular expressions. Don't do
>> that. Perl has a large selection of excellent modules for
>> processing XML. Use them.
BB> Chris, do you talk like that to people in real life, or is it
BB> just the internet?
When you've said the same thing over and over to people who aren't
getting it, there is a clear temptation to speak slowly, with short
sentences and short words.
Charlton
--
Charlton Wilbur
cwilbur [at] chromatico.net
Re: RE Perl Pattern matching
On Wed, 02 Apr 2008 22:16:09 +0000, Ben Bullock wrote:
> On Wed, 02 Apr 2008 10:53:34 -0500, Chris Mattern wrote:
>
>
>> You're trying to parse XML with regular expressions. Don't do that.
>> Perl has a large selection of excellent modules for processing XML. Use
>> them.
>
> Chris, do you talk like that to people in real life, or is it just the
> internet?
I do. Even (especially?) if someone is new around here and is making a
mistake thousands have made before.
M4