Parse Key=Val parameters with s///eg

I'm trying to write a routine to parse a string containing a series of
parameters of the form

KEYWORD = VALUE
or just
KEYWORD (where the value defaults to 1)

I was able to write it using a while loop, but then thought I'd try
using the 'g' option of s/// to do the iteration.

It seems to parse correctly-formed strings fine, but it is not *failing*
to match an incorrectly-formed string.

The following is what I've got so far. $line is set to a valid string,
while $badline is set to an invalid one - which, nevertheless, is parsed
(partially) as valid.

(Incidentally, as I post this I don't know whether my lines will be
hard-wrapped at some ridiculous length like 60, thus screwing up the
code by wrapping my regex comments. What causes that to happen? And,
more to the point, is that something I can override in my POP client?)

Also - if there's a module I should be using instead of re-inventing the
wheel, I'd appreciate knowing about it. But in any event, I'd like to
understand why my approach isn't working.

Here's the code. Thanks for any help.
Chap

- - - - -

#!/usr/bin/perl

use warnings;
use strict;
use feature ":5.10";

#
# $line, unless empty, should contain one or more white-space-separated
# expressions of the form
#       FOO
# or    BAZ = BAR
#
# We need to parse them and set
# $param{FOO} = 1       # default if value is omitted
# $param{BAZ} = 'BAR'
#
# Valid input example:
#   MIN=2 MAX = 12  WEIGHTED TOTAL= 20
# Yields:
# $param{MIN} = '2'
# $param{MAX} = '12'
# $param{WEIGHTED} = 1
# $param{TOTAL} = '20'
#

my $line = 'min=2 max = 12 weighted total= 20';
my $badline = 'min=2 max, = 12 weighted total= 20';

my %param;

if ( $line and
     ($line !~
        s/
            \G                      # Begin where prev. left off
            (?:                     # Either a parameter...
                (?:                 # Keyword clause:
                    (\w+)           # KEYWORD (captured $1)
                    (?:             # Value clause:
                        \s*         #
                        =           # equal sign
                        \s*         #
                        (\w+)       # VALUE (captured $2)
                    )?              # Value clause is optional
                )
                \s*                 # eat up any trailing ws
            |                       # ... or ...
                $                   # End of line.
            )
        /                           # use captured to set %param
            $param{uc $1} = ( $2 ? $2 : 1 ) if defined $1;
        /xeg
     ) ) {
    say "Syntax error: '$line'";
    while (my ($x, $y) = each %param) {say "$x='$y'";}
    exit;
}
while (my ($x, $y) = each %param) {say "$x='$y'";}


--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
Chap Harrison [ Wed, 16 March 2011 17:08 ] [ ID #2056665 ]

Re: Parse Key=Val parameters with s///eg

Oops, I misplaced the final closing parenthesis in the regex. But it
doesn't seem to matter.

- - - - -

#!/usr/bin/perl

use warnings;
use strict;
use feature ":5.10";

#
# $line, unless empty, should contain one or more white-space-separated
# expressions of the form
#       FOO
# or    BAZ = BAR
#
# We need to parse them and set
# $param{FOO} = 1       # default if value is omitted
# $param{BAZ} = 'BAR'
#
# Valid input example:
#   MIN=2 MAX = 12  WEIGHTED TOTAL= 20
# $param{MIN} = '2'
# $param{MAX} = '12'
# $param{WEIGHTED} = 1
# $param{TOTAL} = '20'
#

my $line = 'min=2 max = 12 weighted total= 20';
$line = 'min=2 max, = 12 weighted total= 20';
say $line;
my %param;

if ( $line and
     ($line !~
        s/
            \G                      # Begin where prev. left off
            (?:                     # Either a parameter...
                (?:                 # Keyword clause:
                    (\w+)           # KEYWORD (captured $1)
                    (?:             # Value clause:
                        \s*         #
                        =           # equal sign
                        \s*         #
                        (\w+)       # VALUE (captured $2)
                    )?              # Value clause is optional
                )
                \s*                 # eat up any trailing ws
            )                       ### <-- moved
            |                       # ... or ...
                $                   # End of line.
        /                           # use captured to set %param
            $param{uc $1} = ( $2 ? $2 : 1 ) if defined $1;
        /xeg
     ) ) {
    say "Syntax error: '$line'";
    while (my ($x, $y) = each %param) {say "$x='$y'";}
    exit;
}
while (my ($x, $y) = each %param) {say "$x='$y'";}



Chap Harrison [ Wed, 16 March 2011 17:58 ] [ ID #2056676 ]

Re: Parse Key=Val parameters with s///eg

On Mar 16, 9:58 am, c... [at] pobox.com (Chap Harrison) wrote:
> Oops, I misplaced the final closing parenthesis in the regex. But it
> doesn't seem to matter.
>
[snip]

I believe the problem is the "? # Value clause is optional"
since, in the case of your badline with a ",", the regex will
consume 'max' and then ignore the , since ? means 0 or 1
instance. Therefore the regex will still succeed and $2 will
be undefined. So the VALUE gets set to 1.

Rather than crafting a still more complicated regex, a
quick fix might be a pre-check to see if the line has any
character that's not a \w, =, or whitespace:

# disallow anything but equal, word, or whitespace

die "Syntax error: found disallowed char:$1"
if $line =~ /([^=\w\s])/;

# continue to process line...
if ( $line and $line !~ ....
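
For anyone who wants to try the pre-check on its own, here is a minimal sketch. The helper name first_bad_char is made up for illustration and is not from the original code; it simply wraps the character-class test so both sample lines can be checked side by side.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns the first
# character that is not a word character, '=', or whitespace, or undef if
# the line contains only allowed characters.
sub first_bad_char {
    my ($line) = @_;
    return $line =~ /([^=\w\s])/ ? $1 : undef;
}

for my $line ( 'min=2 max = 12 weighted total= 20',
               'min=2 max, = 12 weighted total= 20' ) {
    my $bad = first_bad_char($line);
    print defined $bad
        ? "Syntax error: found disallowed char '$bad' in '$line'\n"
        : "OK: '$line'\n";
}
```

Note that this only rejects stray characters. A line made of allowed characters in the wrong order (a leading "=", say) would probably still need the real parser to catch it.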

--
Charles DeRykus


derykus [ Thu, 17 March 2011 19:49 ] [ ID #2056739 ]

Re: Parse Key=Val parameters with s///eg

On Mar 17, 2011, at 1:49 PM, C.DeRykus wrote:

> On Mar 16, 9:58 am, c... [at] pobox.com (Chap Harrison) wrote:
>>
[snip]
>
> I believe the problem is the "? # Value clause is optional"
> since, in the case of your badline with a ",", the regex will
> consume 'max' and then ignore the , since ? means 0 or 1
> instance. Therefore the regex will still succeed and $2 will
> be undefined. So the VALUE gets set to 1.
>

I agree - encountering the ',' causes the regex to think it's
encountered a keyword without a value. But why doesn't the *next*
iteration of the global substitution (which would begin at the ',')
fail, causing the if-statement to succeed and print "Syntax error"?

Perhaps I don't fully understand how the /g option works.... I thought
it would continue to "iterate" until either it reached the end of the
string (in which case the s/// would be considered to have succeeded) or
it could not match anything further (in which case it would be
considered to have failed).




Chap Harrison [ Thu, 17 March 2011 21:27 ] [ ID #2056745 ]

Re: Parse Key=Val parameters with s///eg

On 17/03/2011 20:27, Chap Harrison wrote:
> On Mar 17, 2011, at 1:49 PM, C.DeRykus wrote:
>> On Mar 16, 9:58 am, c... [at] pobox.com (Chap Harrison) wrote:
>>>
[snip]
>>>
>>> my $line = 'min=2 max = 12 weighted total= 20';
>>> $line = 'min=2 max, = 12 weighted total= 20';
>>> say $line;
>>> my %param;
>>>
>>> if ( $line and
>>> ($line !~
>>> s/
>>> \G # Begin where prev. left off
>>> (?: # Either a parameter...
>>> (?: # Keyword clause:
>>> (\w+) # KEYWORD (captured $1)
>>> (?: # Value clause:
>>> \s* #
>>> = # equal sign
>>> \s* #
>>> (\w+) # VALUE (captured $2)
>>> )? # Value clause is optional
>>> )
>>> \s* # eat up any trailing ws
>>> ) ###<-- moved
>>> | # ... or ...
>>> $ # End of line.
>>> / # use captured to set %param
>>> $param{uc $1} = ( $2 ? $2 : 1 ) if defined $1;
>>> /xeg
>>> ) ) {
>>> say "Syntax error: '$line'";
>>> while (my ($x, $y) = each %param) {say "$x='$y'";}
>>> exit;}
>>>
>>> while (my ($x, $y) = each %param) {say "$x='$y'";}
>>
>> I believe the problem is the "? # Value clause is optional"
>> since, in the case of your badline with a ",", the regex will
>> consume 'max' and then ignore the , since ? means 0 or 1
>> instance. Therefore the regex will still succeed and $2 will
>> be undefined. So the VALUE gets set to 1.
>>
>
> I agree - encountering the ',' causes the regex to think it's
> encountered a keyword without a value. But why doesn't the *next*
> iteration of the global substitution (which would begin at the ',')
> fail, causing the if-statement to succeed and print "Syntax error"?
>
> Perhaps I don't fully understand how the /g option works.... I
> thought it would continue to "iterate" until either it reached the
> end of the string (in which case the s/// would be considered to have
> succeeded) or it could not match anything further (in which case it
> would be considered to have failed).

Hi Chap

A s///g is 'successful' if it performs at least one substitution, in
which case it will return the number of substitutions made. In your
code, it will find as many key=value substrings as possible and replace
them with just the value string.

The \G pattern is documented only in the case of m//g, which makes sense
as it is defined in terms of a character position (pos) within the
string where the last match ended. If a substitution is being made then
it will also affect character positions, and so is similar to adding to
or deleting from an array while iterating over it.

It is bad form to use the /e modifier to generate side-effects (just as
it is wrong to do so with the map operator).

I believe a while loop is the proper way to go, but if you want to
experiment with m//g I suggest something like this

my %matches = $line =~ /\G\s*(\w+)(?:\s*=\s*(\w+))?\s*/gc;

which will pass all the 'key' and 'key=value' pairs to %matches. An
invalid input will cause the match to terminate before the end of the
string, so

if (pos $line < length $line) {
# code to handle bad input
}

If a key has no corresponding value in the string it will appear in the
hash with a value of undef, which should be defaulted to 1 like this

foreach (values %matches) {
$_ = 1 if not defined;
}
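
Putting the three fragments above together, a runnable sketch might look like the following. The wrapper name parse_params, the upper-casing of keys (borrowed from the original script), and the "// 0" guard for the case where nothing matched at all are assumptions added for illustration, not part of the suggestion itself.

```perl
use 5.010;      # for say and the // operator
use strict;
use warnings;

# Hypothetical wrapper around the m//gc idea: returns a hash reference on
# success, or nothing (undef in scalar context) if part of the line could
# not be parsed.
sub parse_params {
    my ($line) = @_;

    # In list context, //g returns the captures of every match, so each
    # parameter contributes a (key, value) pair; value is undef if omitted.
    my %matches = $line =~ /\G\s*(\w+)(?:\s*=\s*(\w+))?\s*/gc;

    # With /c, pos() stays where the last successful match ended.
    # It is undef if nothing matched, hence the fallback to 0.
    my $end = pos($line) // 0;
    return if $end < length $line;

    # Default missing values to 1 and upper-case the keys.
    my %param;
    $param{ uc $_ } = $matches{$_} // 1 for keys %matches;
    return \%param;
}

for my $line ( 'min=2 max = 12 weighted total= 20',
               'min=2 max, = 12 weighted total= 20' ) {
    my $param = parse_params($line);
    if ($param) {
        say join ' ', map { "$_='$param->{$_}'" } sort keys %$param;
    }
    else {
        say "Syntax error: '$line'";
    }
}
```

One difference from the original script worth knowing: because the default uses defined-or rather than a truth test, a value of 0 is kept as 0 here instead of being turned into 1.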

I hope that helps a little.

Rob



Rob Dixon [ Thu, 17 March 2011 23:21 ] [ ID #2056754 ]

Re: Parse Key=Val parameters with s///eg

On Mar 17, 1:27 pm, c... [at] pobox.com (Chap Harrison) wrote:
> On Mar 17, 2011, at 1:49 PM, C.DeRykus wrote:
>
> > On Mar 16, 9:58 am, c... [at] pobox.com (Chap Harrison) wrote:
>
[snip]
>
> > I believe the problem is the "? # Value clause is optional"
> > since, in the case of your badline with a ",", the regex will
> > consume 'max' and then ignore the , since ? means 0 or 1
> > instance. Therefore the regex will still succeed and $2 will
> > be undefined. So the VALUE gets set to 1.
>
> I agree - encountering the ',' causes the regex to think it's
> encountered a keyword without a value. But why doesn't the *next*
> iteration of the global substitution (which would begin at the ',')
> fail, causing the if-statement to succeed and print "Syntax error"?
>
> Perhaps I don't fully understand how the /g option works.... I thought
> it would continue to "iterate" until either it reached the end of the
> string (in which case the s/// would be considered to have succeeded)
> or it could not match anything further (in which case it would be
> considered to have failed).

It does iterate through the string until match failure or
end of string. The s///g returns the count of successful
substitutions but, due to the !~, that count is logically negated
before it is returned. So the negated return would have been true,
and the syntax error branch taken, only if there had been no
matches at all.

For instance, this fails to match immediately since 'a' doesn't
match \d and the negated return of false causes "true" to print:

perl -wle "my $x='abc123'; print 'true' if $x and $x !~ s/\G\d//g"

But this matches once before failing, and the negated return of
that count causes the statement qualifier to fail so nothing gets
printed:

perl -wle "my $x='1abc23'; print 'true' if $x and $x !~ s/\G\d//g"
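
The same two cases can be put in a small script that prints both the substitution count and the negated result, which may make the behaviour easier to see. The helper name count_and_negate and the third input '123' are illustrative additions, not part of the one-liners above.

```perl
use strict;
use warnings;

# Hypothetical helper: runs the same substitution twice on fresh copies of
# the input, once to get the count returned by s///g and once to get the
# boolean produced by the !~ form.
sub count_and_negate {
    my ($input) = @_;

    my $copy  = $input;                        # s/// modifies its target
    my $count = ( $copy =~ s/\G\d//g ) || 0;   # empty string means no match

    $copy = $input;
    my $negated = ( $copy !~ s/\G\d//g ) ? 1 : 0;

    return ( $count, $negated );
}

for my $input ( 'abc123', '1abc23', '123' ) {
    my ( $count, $negated ) = count_and_negate($input);
    printf "%-9s substitutions=%d  !~ is %s\n",
        "'$input'", $count, $negated ? 'true' : 'false';
}
```

Only the input with zero substitutions makes the !~ form true, so a partially valid line never reaches the error branch.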


See: perldoc perlop for details about the substitution
operator and the \G assertion.

--
Charles DeRykus


derykus [ Thu, 17 March 2011 23:43 ] [ ID #2056755 ]

Re: Parse Key=Val parameters with s///eg

On Mar 17, 2011, at 5:21 PM, Rob Dixon wrote:

> A s///g is 'successful' if it performs at least one substitution, in
> which case it will return the number of substitutions made. In your
> code, it will find as many key=value substrings as possible and
> replace them with just the value string.
>
> The \G pattern is documented only in the case of m//g, which makes
> sense as it is defined in terms of a character position (pos) within
> the string where the last match ended. If a substitution is being made
> then it will also affect character positions, and so is similar to
> adding to or deleting from an array while iterating over it.
>
> It is bad form to use the /e modifier to generate side-effects (just
> as it is wrong to do so with the map operator).

All very useful info - thanks.

> I believe a while loop is the proper way to go, but if you want to
> experiment with m//g I suggest something like this
>
> my %matches = $line =~ /\G\s*(\w+)(?:\s*=\s*(\w+))?\s*/gc;
>
> which will pass all the 'key' and 'key=value' pairs to %matches. An
> invalid input will cause the match to terminate before the end of the
> string, so
>
> if (pos $line < length $line) {
> # code to handle bad input
> }
>
> If a key has no corresponding value in the string it will appear in
> the hash with a value of undef, which should be defaulted to 1 like
> this
>
> foreach (values %matches) {
> $_ = 1 if not defined;
> }
>
> I hope that helps a little.

It helped a lot. At this point I'd agree that the while loop is the
most straightforward approach.
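
The while-loop version was never posted in this thread, so the following is only a sketch of what one could look like, not the code actually used. The function name parse_with_loop is hypothetical; the loop uses m//gc in scalar context so that pos() survives the final failed match and can be checked against the end of the string.

```perl
use 5.010;      # for say and the // operator
use strict;
use warnings;

# Hypothetical while-loop parser: returns a hash reference, or nothing
# (undef in scalar context) if the line contains something unparseable.
sub parse_with_loop {
    my ($line) = @_;
    my %param;

    # Each iteration consumes one KEYWORD or KEYWORD = VALUE item,
    # starting where the previous match ended (\G).
    while ( $line =~ /\G\s*(\w+)(?:\s*=\s*(\w+))?/gc ) {
        $param{ uc $1 } = $2 // 1;    # value defaults to 1 when omitted
    }

    # Only trailing whitespace may remain; anything else is an error.
    return unless $line =~ /\G\s*\z/gc;
    return \%param;
}

for my $line ( 'min=2 max = 12 weighted total= 20',
               'min=2 max, = 12 weighted total= 20' ) {
    my $param = parse_with_loop($line);
    if ($param) {
        say join ' ', map { "$_='$param->{$_}'" } sort keys %$param;
    }
    else {
        say "Syntax error: '$line'";
    }
}
```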

Thanks again.

Chap


Chap Harrison [ Thu, 17 March 2011 23:44 ] [ ID #2056756 ]

Re: Parse Key=Val parameters with s///eg

On Mar 17, 3:44 pm, c... [at] pobox.com (Chap Harrison) wrote:
> On Mar 17, 2011, at 5:21 PM, Rob Dixon wrote:
>
> > A s///g is 'successful' if it performs at least one substitution, in
> > which case it will return the number of substitutions made. In your
> > code, it will find as many key=value substrings as possible and
> > replace them with just the value string.
>
> > The \G pattern is documented only in the case of m//g, which
> > makes sense as it is defined in terms of a character position
> > (pos) within the string where the last match ended.

Um, the substitution operator uses m/// to pattern match
as well so you'll sometimes see \G in s///.

> > If a substitution is being made then it will also affect character
> > positions, and so is similar to adding to or deleting from an array
> > while iterating over it.

No, I don't think so. Or am I missing something obvious?
For instance, the char. positions within the original string
being pattern matched are unaffected by s/// replacements:

perl -wle "$_= '123abc4'; s/\G(\d)(?{print pos()})/$1 . 'foo'/eg;print"
1
2
3
1foo2foo3fooabc4

> ...
>
> > I believe a while loop is the proper way to go, but if you want to
> > experiment with m//g I suggest something like this
>
> > my %matches = $line =~ /\G\s*(\w+)(?:\s*=\s*(\w+))?\s*/gc;
>
> > which will pass all the 'key' and 'key=value' pairs to %matches. An
> > invalid input will cause the match to terminate before the end of
> > the string, so
>
> > if (pos $line < length $line) {
> > # code to handle bad input
> > }
>

Neat solution. IMO, though, it's much clearer and simpler
(particularly for subsequent maintainers) to identify errors
up front if you can, and save the time and complexity of
reviewing \G and pos() and the implication of a s/// return
count (unless of course you're using them all the time).

die "bad char: $1" if $line =~ / ([^\w=\s]) /x;

> > If a key has no corresponding value in the string it will appear in
> > the hash with a value of undef, which should be defaulted to 1 like
> > this
>
> > foreach (values %matches) {
> > $_ = 1 if not defined;
> > }
>

You can also use the defined-or operator.

$_ //= 1 foreach values %matches;
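
A tiny standalone illustration of the defined-or assignment, using a made-up %matches hash rather than real parser output:

```perl
use 5.010;      # the // and //= operators need Perl 5.10 or later
use strict;
use warnings;

# Sample data standing in for the parser's result (illustrative only).
my %matches = ( MIN => 2, WEIGHTED => undef, TOTAL => 0 );

# values() yields aliases to the hash's values, so assigning to $_
# updates the hash in place. Only undef is replaced; 0 is left alone.
$_ //= 1 for values %matches;

say "$_=$matches{$_}" for sort keys %matches;
```

Unlike a plain truth test such as ( $2 ? $2 : 1 ), this keeps a legitimate value of 0.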

--
Charles DeRykus


derykus [ Fri, 18 March 2011 18:03 ] [ ID #2056821 ]

Re: Parse Key=Val parameters with s///eg

On Mar 18, 2011, at 12:03 PM, C.DeRykus wrote:

> Neat solution. IMO, though, it's much clearer and simpler
> (particularly for subsequent maintainers), to identify errors
> up front if you can and save the time and complexity to
> review \G and pos() and the implication of a s/// return
> count (unless of course you're using them all the time).

This raises the question: is it possible, using a recursive regex and
s/// or m//, to - in one fell swoop - identify a correctly-formed string
AND save the key=value pairs into a hash?

Eons ago, when writing a SNOBOL4 program to match an arbitrary algebraic
expression, I tried to make it evaluate the expression as it "unwound."
I never succeeded -- so it's become something of a holy grail for me.
Not a highly important one, though ;-)



Chap Harrison [ Fri, 18 March 2011 23:24 ] [ ID #2056833 ]

Re: Parse Key=Val parameters with s///eg


Yes and no. Technically I suppose you can, maybe doing something like this:

use 5.010;
use Data::Dumper;

$_ = "KEYWORD = VALUE MIN=2 MAX, = 12 WEIGHTED TOTAL= 20 WHAT =, 12 TEST = 1000";

our %results;

m!
(?&RECURSE)

(?(DEFINE)
(?<BAD_ASSIGNMENT> (?&BAD_LVSIDE)|(?&BAD_RVSIDE) )
(?<BAD_LVSIDE> (?&P_PLUS) = (?&P_STAR) )
(?<BAD_RVSIDE> (?&P_STAR) = (?&P_PLUS) )
(?<P_PLUS> \s*\p{P}+\s* )
(?<P_STAR> \s*\p{P}*\s* )
(?<SKIP_AND_RECURSE> (?&BAD_ASSIGNMENT) \w++ \s*+ (?&END) )
(?<END> (?&RECURSE) | \Z )
(?<KEYWORD_SEP> [\s\p{P}]*+ )
(?<RECURSE>
(?<keyword> \w++ ) #Grab a keyword
(?(?=(?&BAD_ASSIGNMENT)) #If it's followed by a malformed assignment,
(?&SKIP_AND_RECURSE) #Grab the right side and skip forward.
| #Otherwise,
(?: \s* = \s* (?<value> \w+ ) )? #If there's an assignment, grab the value.
(?{
$results{ $+{keyword} } = $+{value} // 1 #Get the results,
})
) (?&KEYWORD_SEP) #Eat up any whitespace/punctuation until the next keyword,
(?&END) #And recurse. If we are at the end of the string, we are done.
)
)
!x;

say Dumper \%results;
(Wow, gmail screwed up my indentation there. Here: http://ideone.com/vFSLc)

But really you shouldn't; there is no need to do it in a single expression,
other than causing your maintenance programmer some headache. And if doing
it in one go is a requirement for whatever reason, using Regexp::Grammars[1]
instead is probably a much better solution.

Brian.

[1]
http://search.cpan.org/~dconway/Regexp-Grammars-1.012/lib/Regexp/Grammars.pm


On Fri, Mar 18, 2011 at 7:24 PM, Chap Harrison <clh [at] pobox.com> wrote:

>
> This raises the question: is it possible, using a recursive regex and s///
> or m//, to - in one fell swoop - identify a correctly-formed string AND save
> the key=value pairs into a hash?
>
> Eons ago, when writing a SNOBOL4 program to match an arbitrary algebraic
> expression, I tried to make it evaluate the expression as it "unwound." I
> never succeeded -- so it's become something of a holy grail for me. Not a
> highly important one, though ;-)
>
Brian Fraser [ Sat, 19 March 2011 04:52 ] [ ID #2056867 ]

Re: Parse Key=Val parameters with s///eg

On Mar 18, 2011, at 10:52 PM, Brian Fraser wrote:

> There is no need to do it in a single expression, other than causing
> your maintenance programmer some headache.

Yes, that was certainly a challenge to follow!

Might I conclude, then, that a lot of those Perl extensions are of
dubious value, if one takes maintainability into account? Or are there
situations in which they're considered essential? (I realize that's a
matter of opinion, but I am curious to learn when or whether it's
advisable to have those constructs in my "toolbox" at all.)

Chap

Chap Harrison [ Sat, 19 March 2011 19:39 ] [ ID #2056874 ]