putting file columns into arrays
Dear All,
I would like a subroutine that will allow me to easily put columns of a tab delimited file into their own arrays.
I've been calling the following repeatedly for each column:
my @array1 = getcolvals($filehandle, 0);
my @array2 = getcolvals($filehandle, 1); ...etc.
sub getcolvals {
    @_ and not @_ % 2 or die "Incorrect number of arguments to getcolvals!\n";
    my $myfile = shift;
    my $mycol = shift;

    my @column = ();

    while (<$myfile>) {
        my ($field) = (split /\s/, $_)[$mycol];
        push @column, $field;
    }
    return @column;
}
This accomplishes exactly what I want, but it requires going through the whole file for each column extraction, which seems inefficient. Also, I want to know if I can modify the subroutine to return all the (arbitrary number of) columns at once into arrays. Any suggestions?
Many thanks,
Eric
--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
Re: putting file columns into arrays
>>>>> "EM" == Eric Mooshagian <ericmooshagian [at] gmail.com> writes:
EM> I would like a subroutine that will allow me to easily put columns
EM> of a tab delimited file into their own arrays.
EM> I've been calling the following repeatedly for each column:
EM> my @array1 = getcolvals($filehandle, 0);
EM> my @array2 = getcolvals($filehandle, 1); ...etc.
whenever you think you need to name things with numeric parts, you
usually need an array. since you want arrays, then you really want an
array of arrays.
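as a rough illustration of what an array of arrays looks like (the sample values and the names @columns, $value and $count are made up for this sketch, they are not from the original post):

```perl
use strict;
use warnings;

# Hypothetical sample data: two columns, three rows each.
my @columns = (
    [ 'a', 'b', 'c' ],    # column 0
    [ 1,   2,   3 ],      # column 1
);

# Append a value to column 1 by dereferencing its inner array.
push @{ $columns[1] }, 4;

# Element access: column index first, then row index.
my $value = $columns[0][2];

# Number of values currently stored in column 1.
my $count = scalar @{ $columns[1] };
```

each element of @columns is a reference to an ordinary array, so one variable replaces @array1, @array2 and so on.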
EM> sub getcolvals {
EM> @_ and not @_ % 2 or die "Incorrect number of arguments to getcolvals!\n";
that is sort of clunky. why not just check @_ == 2?
@_ == 2 or die ...
EM> my $myfile = shift;
EM> my $mycol = shift;
it is usually better to assign from @_. i posted not too long ago several
reasons why. check the archives for it.
my( $myfile, $mycol ) = @_ ;
and in this case you won't need a $mycol since the code will load all
the columns into arrays.
EM> my @column = ();
you don't need to initialize my arrays to () as my does that for you.
EM> while (<$myfile>) {
this will fail unless you reopen the file each time you call the sub or
you seek to the beginning of the file.
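a small sketch of the rewind case, using an in-memory file handle with made-up sample data so it runs without a real file (the variable names are only for illustration):

```perl
use strict;
use warnings;

# In-memory handle standing in for the real file; the data is invented.
my $data = "a\t1\nb\t2\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my @first  = <$fh>;    # reads every line, leaving the handle at end of file
my @second = <$fh>;    # nothing left to read, so this list is empty

# Rewind to byte 0 (whence 0 means "from the start of the file").
seek $fh, 0, 0 or die "seek failed: $!";

my @third = <$fh>;     # all lines are available again
```

without the seek, every call after the first would get an empty column back.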
EM> my ($field) = (split /\s/, $_)[$mycol];
since you are slicing the split and getting one value, you don't need
the () around $field.
EM> push @column, $field;
and you can combine both of those lines into one:
push @column, (split /\s/, $_)[$mycol] ;
EM> }
EM> return @column;
EM> }
this is untested:
# this is a faster and easier way to get lines from a file
use File::Slurp ;
sub load_columns {
    my( $file_name ) = @_ ;
    $file_name or die 'load_columns: missing file name' ;
    my @lines = read_file $file_name ;
    my @matrix ;
    foreach my $line ( @lines ) {
        my @fields = split ' ', $line ;
        for my $i ( 0 .. $#fields ) {
            # build up the array of arrays here. each array gets the next field value
            push( @{$matrix[$i]}, $fields[$i] ) ;
        }
    }
    # return a reference to the array of column arrays
    return \@matrix ;
}
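a variant of the same idea that takes an already open file handle and uses only core perl might look like the sketch below. the name columns_from_handle, the sample data, and the choice to split on tabs only (with a limit of -1 so trailing empty fields are kept) are assumptions for illustration, not part of the code above:

```perl
use strict;
use warnings;

# Hypothetical helper: read the handle once and return a reference
# to an array of column arrays.
sub columns_from_handle {
    my ($fh) = @_;
    $fh or die 'columns_from_handle: missing file handle';

    my @matrix;
    while ( my $line = <$fh> ) {
        chomp $line;
        # Split on tabs only; -1 keeps trailing empty fields.
        my @fields = split /\t/, $line, -1;
        # Field $i of every line is appended to column $i.
        push @{ $matrix[$_] }, $fields[$_] for 0 .. $#fields;
    }
    return \@matrix;
}

# Usage with an in-memory handle and invented data.
my $data = "x\t1\ny\t2\nz\t3\n";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $columns = columns_from_handle($fh);
my @names   = @{ $columns->[0] };    # first column
my @numbers = @{ $columns->[1] };    # second column
```

the file is read once, and each column is pulled out of the returned reference afterwards.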
for more on references and perl data structures read:
perlreftut
perllol
perldsc
uri
--
Uri Guttman ------ uri [at] stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------