how to sort in perl where the key is the 2nd of 3 components of a
Hi;
I have a legacy Perl script that I need to modify.
The current output is a set of zero or more space-separated strings, where
each string consists of three dash-separated substrings.
I want to sort these space-separated strings based on the middle one of the
three dash-separated substrings.
Each space-separated string matches something close to, but not exactly,
the following regex:
[a-z0-9][a-z0-9]+-[A-Z][A-Z][A-Z]0z0r[1-9a-z][1-9]+-[c-z][a-z][1-9]
Here is what I tried with awk, tr and sort:
echo "example matching long string of real world data" | tr ' ' '\n' |
  awk -F\- '{print $1, $2, $3}' | sort -d -k2 | tr ' ' '-' | tr '\n' ' '
This does what is intended although I'd sure like to be able to simplify
that and have not figured out how to do it.
So should I do the same thing in Perl (ie: accumulate in an array with push,
then split, then sort?, then print)?
Or is there a more direct way?
I need to keep the Perl as simple and as maintainable as possible.
Thanks in advance for your advice.
Ken Wolcott
kennethwolcott [at] gmail.com
Re: how to sort in perl where the key is the 2nd of 3 components of a string where a dash is the sepa
>>>>> "KW" == Kenneth Wolcott <kennethwolcott [at] gmail.com> writes:
KW> The current output is a set of zero or more space separated
KW> strings where each substring is a string of three
KW> dash-separated substrings.
you should always show sample input data and expected output (sorted
data). a description, even a simple one, can differ from the actual
data.
KW> I want to sort these space separated strings based on the middle of the
KW> dash-separated substring.
KW> The regex for the space-separated sub string is not exactly, but close to
KW> the following regex.
KW> [a-z0-9][a-z0-9]+-[A-Z][A-Z][A-Z]0z0r[1-9a-z][1-9]+-[c-z][a-z][1-9]
blech. if you really have dash separators, use split. whenever you think
separator, think split.
KW> Here is what I tried with awk, tr and sort:
and why would that matter in a perl question? keep on subject, how to
sort your data in perl
KW> So should I do the same thing in Perl (ie: accumulate in an array
KW> with push, then split, then sort?, then print)?
basically that is correct. there are various ways to handle that
too. one perl classic way is the schwartzian transform. a faster variant
is the GRT. you can learn how to code them from the Sort::Maker module
which can generate sort code in 4 styles. all you need to do is describe
how to get the key from the record. split makes that easy:
(split /-/, $record)[1]
that gets the middle field from a record.
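as a minimal sketch of how that expression could drive a plain sort: the
sample data and the helper name middle_field below are made up for
illustration (the original post shows no real data), and the key is
recomputed on every comparison, which is fine for short lists.

```perl
use strict;
use warnings;

# Hypothetical sample data; the original post did not include any.
my $line = 'ab1-XYZ0z0rb12-dz3 cd2-ABC0z0ra11-ey1 ef3-MNO0z0rc13-fx2';

# Return the middle of the three dash-separated fields of a record.
sub middle_field {
    my ($record) = @_;
    return (split /-/, $record)[1];
}

# Break the line into records on whitespace, then sort by the middle field.
my @records = split ' ', $line;
my @sorted  = sort { middle_field($a) cmp middle_field($b) } @records;

# Rebuild the space-separated output.
print join(' ', @sorted), "\n";
```

the schwartzian transform and the GRT mentioned above avoid recomputing
the key for every comparison, which only starts to matter for larger lists.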
KW> I need to keep the Perl as simple and as maintainable as possible.
use that to start and either use the module and see what it generates or
google for those techniques and see what you can do. this is not
considered a complex sort at all. code up what you can and show it and
ask here for more help if you need.
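for reference, here is a rough hand-written sketch of the GRT idea: prepend
the key to each record, sort the resulting strings with the default string
sort, then strip the key off again. it assumes the records never contain a
NUL byte (used here as the separator), and the sample data is invented. it
is not the code Sort::Maker would generate.

```perl
use strict;
use warnings;

# Hypothetical sample records; the thread never shows real data.
my @records = qw(
    ab1-XYZ0z0rb12-dz3
    cd2-ABC0z0ra11-ey1
    ef3-MNO0z0rc13-fx2
);

# Read the pipeline bottom-up:
my @sorted =
    map  { substr($_, index($_, "\0") + 1) }           # 3. drop "key\0" prefix
    sort                                               # 2. default string sort
    map  { my $key = (split /-/, $_)[1]; "$key\0$_" }  # 1. prepend key + NUL
    @records;

print join(' ', @sorted), "\n";
```

because NUL sorts below every other character, a key that is a prefix of
another key still sorts first, the same as it would with cmp. records with
equal keys end up ordered by the full record text.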
uri
--
Uri Guttman ------ uri [at] stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
Re: how to sort in perl where the key is the 2nd of 3 components of a
On Fri, Jul 23, 2010 at 7:21 AM, Uri Guttman <uri [at] stemsystems.com> wrote:
> basically that is correct. there are various ways to handle that
> too. one perl classic way is the schwartzian transform. a faster variant
> is the GRT. you can learn how to code them from the Sort::Maker module
> which can generate sort code in 4 styles. all you need to do is describe
> how to get the key from the record. split makes that easy:
>
>        (split /-/, $record)[1]
>
> that gets the middle field from a record.
For doing the Sort, you may consider the schwartzian_transform
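sub schwartzian_transform(&@) {
    my $compute = shift;    # code block that extracts the sort key from $_
    return map  { $_->[1] }                   # 3. unwrap the original records
           sort { $a->[0] cmp $b->[0] }       # 2. compare the precomputed keys
           map  { [ $compute->(), $_ ] } @_;  # 1. pair each record with its key
}

# sort @array by the middle dash-separated field
my @sorted = schwartzian_transform { (split /-/, $_)[1] } @array;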
More information on:
http://sites.google.com/site/oleberperlrecipes/recipes/01-variables/02-arrays/03-sort-arrays?pli=1
Best regards
Marcos Rebelo
--
Marcos Rebelo
http://oleber.freehostia.com
Milan Perl Mongers leader http://milan.pm.org
Webmaster of http://sites.google.com/site/oleberperlrecipes/
Re: how to sort in perl where the key is the 2nd of 3 components of a string where a dash is the sepa
marcos rebelo wrote:
> For doing the Sort, you may consider the schwartzian_transform
>
> sub schwartzian_transform(&@) {
>     my $compute = shift;
>     return map { $_->[1] }
>            sort { $a->[0] cmp $b->[0] }
>            map { [ $compute->(), $_ ] } @_;
> }
>
> schwartzian_transform { (split /-/, $_)[1] } @array;
>
> More information on:
> http://sites.google.com/site/oleberperlrecipes/recipes/01-variables/02-arrays/03-sort-arrays?pli=1
That subroutine has a much too general name, because it only implements
one way to create a temporary index (AKA Schwartzian Transform).
See also http://en.wikipedia.org/wiki/Schwartzian_transform
and of course Sort::Maker. http://search.cpan.org/search?q=Sort::Maker
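As one illustration of a different way to build such a temporary index,
the keys can be precomputed into a hash and looked up inside the sort
block. This is only a sketch: the sample data and the name %key_for are
made up, and it assumes that identical records may share one hash entry.

```perl
use strict;
use warnings;

# Hypothetical sample records; no real data appears in the thread.
my @records = qw(
    ab1-XYZ0z0rb12-dz3
    cd2-ABC0z0ra11-ey1
    ef3-MNO0z0rc13-fx2
);

# Temporary index: record => middle dash-separated field,
# computed once per record instead of once per comparison.
my %key_for;
$key_for{$_} = (split /-/, $_)[1] for @records;

# Sort the records by looking up their precomputed keys.
my @sorted = sort { $key_for{$a} cmp $key_for{$b} } @records;

print join(' ', @sorted), "\n";
```

Compared with the array-of-pairs version quoted above, this keeps the sort
block readable at the cost of one extra hash.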
--
Ruud