Regexp to search over several lines in one string

Hi!

I have a string, and I want to remove everything behind the ">"
character. The string contains new line characters that I don't want
to remove.

my $string = "line1
line2>
line3";

Why don't I get a match and replacement with this?

$string =~ s/^([^>]*>)/$1/;

I would expect the string to contain:

"line1
line2>"

But it still contains "line3"!!!

Why is this?
Any suggestions for how to do this in another (working) manner?

Best Regards,
Andreas - Sweden
d99alu [ Sun, 27 January 2008 13:18 ] [ ID #1917650 ]

Re: Regexp to search over several lines in one string

d99alu [at] efd.lth.se wrote:

> I have a string, and I want to remove everything behind the ">"
> character. The string contains new line characters that I don't want
> to remove.

s/(?:<=>).*//s;

See perldoc perlre, search for "look-behind".

--
Affijn, Ruud

"Gewoon is een tijger."
rvtol+news [ Sun, 27 January 2008 13:39 ] [ ID #1917652 ]

Re: Regexp to search over several lines in one string

d99alu [at] efd.lth.se wrote:
> I have a string, and I want to remove everything behind the ">"
> character. The string contains new line characters that I don't want
> to remove.
>
> my $string = "line1
> line2>
> line3";
>
> Why don't I get a match and replacement with this?
>
> $string =~ s/^([^>]*>)/$1/;

It does match, but since you capture everything, and insert the captured
string using $1, nothing gets changed.
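
A minimal sketch of that point, using the string from the original post (the names $before and $count are only for illustration): the substitution does succeed, but the replacement is identical to the text that was matched.

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";
my $before = $string;

# [^>] also matches newlines, so the pattern matches "line1\nline2>".
# The replacement $1 is exactly that matched text, so nothing changes.
my $count = ( $string =~ s/^([^>]*>)/$1/ );

print "substitutions made: $count\n";
print "string is unchanged\n" if $string eq $before;
```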

> I would expect the string to contain:
>
> "line1
> line2>"
>
> But it still contains "line3"!!!
>
> Why is this?

Because your regex does not match the "line3" portion of the string.

> Any suggestions for how to do this in another (working) manner?

One way to remove everything after the '>' character would be:

$string =~ s/[^>]+$//;

However, that removes the newline between "line2>" and "line3" as well...

This removes everything after '>' but newlines:

$string =~ s{([^>]+)$}{
    my $rm = $1;
    $rm =~ s/.+//g;
    $rm;
}e;
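
Run against the string from the original post, the two substitutions above differ only in the trailing newline. This is just a sketch; the names $original, $all and $keep_nl are made up for the comparison.

```perl
use strict;
use warnings;

my $original = "line1\nline2>\nline3";

# Variant 1: remove everything after the last '>', newlines included.
( my $all = $original ) =~ s/[^>]+$//;

# Variant 2: same match, but the replacement is computed by the /e block.
# Without /s, '.' does not match "\n", so only non-newline text is deleted.
( my $keep_nl = $original ) =~ s{([^>]+)$}{
    my $rm = $1;
    $rm =~ s/.+//g;
    $rm;
}e;

print "variant 1 ends with a newline: ", ( $all     =~ /\n\z/ ? "yes" : "no" ), "\n";
print "variant 2 ends with a newline: ", ( $keep_nl =~ /\n\z/ ? "yes" : "no" ), "\n";
```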

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
Gunnar Hjalmarsson [ Sun, 27 January 2008 14:11 ] [ ID #1917653 ]

Re: Regexp to search over several lines in one string

Dr.Ruud wrote:
> d99alu [at] efd.lth.se wrote:
>
>> I have a string, and I want to remove everything behind the ">"
>> character. The string contains new line characters that I don't want
>> to remove.
>
> s/(?:<=>).*//s;

ITYM: s/(?<=>).*//s;
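
The two spellings differ by one character but mean different things: (?:<=>) is a non-capturing group that matches the literal text "<=>", while (?<=>) is a look-behind for ">". A small sketch on the original poster's string (the variable names are illustrative only):

```perl
use strict;
use warnings;

my $original = "line1\nline2>\nline3";

# (?:<=>) is a non-capturing group matching the literal text "<=>".
# That text does not occur in the string, so nothing is replaced.
( my $typo = $original ) =~ s/(?:<=>).*//s;

# (?<=>) is a look-behind: match only at a position preceded by ">".
# With /s, ".*" then consumes the rest of the string, newlines included.
( my $fixed = $original ) =~ s/(?<=>).*//s;

print "typo version changed the string\n"  if $typo  ne $original;
print "fixed version changed the string\n" if $fixed ne $original;
```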


John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
someone [ Sun, 27 January 2008 16:06 ] [ ID #1917655 ]

Re: Regexp to search over several lines in one string

d99alu [at] efd.lth.se wrote:
> Hi!
>
> I have a string, and I want to remove everything behind the ">"
> character. The string contains new line characters that I don't want
> to remove.
>
> my $string = "line1
> line2>
> line3";
>
> Why don't I get a match and replacement with this?
>
> $string =~ s/^([^>]*>)/$1/;
>
> I would expect the string to contain:
>
> "line1
> line2>"
>

$string =~ s/^([^>]*>).*$/$1/s;

line1
line2>

--
Petr Vileta, Czech republic
(My server rejects all messages from Yahoo and Hotmail. Send me your
mail from another non-spammer site please.)

Please reply to <petr AT practisoft DOT cz>
Petr Vileta [ Sun, 27 January 2008 16:39 ] [ ID #1917656 ]

Re: Regexp to search over several lines in one string

John W. Krahn wrote:
> Dr.Ruud:
>> d99alu:

>>> I have a string, and I want to remove everything behind the ">"
>>> character. The string contains new line characters that I don't want
>>> to remove.
>>
>> s/(?:<=>).*//s;
>
> ITYM: s/(?<=>).*//s;

Yes. (aaargh, oops again)

--
Affijn, Ruud

"Gewoon is een tijger."
rvtol+news [ Sun, 27 January 2008 16:59 ] [ ID #1917658 ]

Re: Regexp to search over several lines in one string

Dr.Ruud wrote:
> d99alu:

>> I have a string, and I want to remove everything behind the ">"
>> character. The string contains new line characters that I don't want
>> to remove.
>
> s/(?:<=>).*//s;
>
> See perldoc perlre, search for "look-behind".

I also forgot the newline. Maybe this does what you need:

s/(?<=>).*/\n/s;

(doesn't keep any of the original newlines; even adds one when none was
there)
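
A quick sketch of that behaviour on three made-up inputs (only the first is the original poster's string): whatever follows the first '>' is replaced by exactly one newline.

```perl
use strict;
use warnings;

my @inputs = (
    "line1\nline2>\nline3",    # newline directly after '>'
    "line1\nline2>line3",      # no newline after '>'
    "line2>",                  # nothing at all after '>'
);

# Everything after the first '>' is replaced by a single "\n",
# whether or not a newline was there originally.
my @results = map { ( my $s = $_ ) =~ s/(?<=>).*/\n/s; $s } @inputs;

for my $i ( 0 .. $#inputs ) {
    ( my $in  = $inputs[$i] )  =~ s/\n/\\n/g;    # show newlines visibly
    ( my $out = $results[$i] ) =~ s/\n/\\n/g;
    print "$in  =>  $out\n";
}
```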

--
Affijn, Ruud

"Gewoon is een tijger."
rvtol+news [ Sun, 27 January 2008 17:11 ] [ ID #1917659 ]

Re: Regexp to search over several lines in one string

Petr Vileta wrote:
>
> $string =~ s/^([^>]*>).*$/$1/s;

The '$' character is redundant after .*
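
A short check, assuming the string from the original post: with the /s modifier, ".*" already runs to the end of the string, so both spellings give the same result. The variable names are illustrative.

```perl
use strict;
use warnings;

my $original = "line1\nline2>\nline3";

# With /s, ".*" consumes everything up to the end of the string,
# so the trailing '$' has nothing left to constrain.
( my $with_dollar    = $original ) =~ s/^([^>]*>).*$/$1/s;
( my $without_dollar = $original ) =~ s/^([^>]*>).*/$1/s;

print "same result\n" if $with_dollar eq $without_dollar;
```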

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
Gunnar Hjalmarsson [ Sun, 27 January 2008 19:02 ] [ ID #1917661 ]

Re: Regexp to search over several lines in one string

Gunnar Hjalmarsson wrote:
> d99alu [at] efd.lth.se wrote:
>> I have a string, and I want to remove everything behind the ">"
>> character. The string contains new line characters that I don't want
>> to remove.
>>
>> my $string = "line1
>> line2>
>> line3";
>>
>> Why don't I get a match and replacement with this?
>>
>> $string =~ s/^([^>]*>)/$1/;
>
> It does match, but since you capture everything, and insert the captured
> string using $1, nothing gets changed.

I have a feeling that the code above actually is an attempt to do:

if ( $string =~ /^([^>]*>)/ ) {
    $string = $1;
}

That replaces the content of _$string_ with what was captured in the
regex. However, it's accomplished via the m// operator, while you were
using the s/// operator.

I recommend that you read up on both those operators in "perldoc perlop".
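
For comparison, the same m// idea can also be written with a list-context assignment. This is only an alternative spelling, not something posted in the thread, and $kept is a made-up name.

```perl
use strict;
use warnings;

my $string = "line1\nline2>\nline3";

# m// in list context returns the captured groups. If the match fails the
# list is empty, $kept stays undef, and $string is left alone.
my ($kept) = $string =~ /^([^>]*>)/;
$string = $kept if defined $kept;

print "$string\n";
```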

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
Gunnar Hjalmarsson [ Sun, 27 January 2008 22:48 ] [ ID #1917664 ]

Re: Regexp to search over several lines in one string

Gunnar Hjalmarsson wrote:
> Petr Vileta:

>> $string =~ s/^([^>]*>).*$/$1/s;
>
> The '$' character is redundant after .*

Yes, in this case (because of the s-modifier) it is.

$ echo "abcd" |perl -pe 's/(.*)$/"<".++$i."=$1:".length($1).">\n"/ge'
<1=abcd:4>
<2=:0>

<3=:0>


$ echo "abcd" |perl -pe 's/(.*)$/"<".++$i."=$1:".length($1).">\n"/sge'
<1=abcd
:5>
<2=:0>

--
Affijn, Ruud

"Gewoon is een tijger."
rvtol+news [ Mon, 28 January 2008 19:31 ] [ ID #1918539 ]