genbank library file

Hi,

I'm trying to create a genbank library file that contains several genbank
records. I read in the genbank record names from a separate file into an
array and then loop through array of file names, open each file and read
contents into another array. The problem is in looping through the array and
opening individual files. The code is below. I successfully populate @ids
but cannot manage to loop through @ids, open files and populate @library as
I get "can't open file !\n"; messages. Please advise at your convenience,
Many thanks,

galeb

#!/usr/bin/perl
# create_gb_library.pl
use strict; use warnings;

open IDFILE, shift or die "can't read idfile!\n";

my @ids = '';

while( <IDFILE> ) {
    my $filename = $_;
    push( @ids, $filename );
}

my @library;

for my $id( @ids ) {
    chomp $id;
    open FILE, $id or die "can't open file $id!\n";
    $/ = "//\n"; # input-record separator is genbank end-of-record separator
    my $record = <FILE>;
    push( @library, $record)
}
close FILE;

open OUTLIBRARY, ">gblibrary.txt" or die "can't open gblibrary.txt!\n";

print OUTLIBRARY @library, "\n";

close OUTLIBRARY;

galeb abu-ali [ Sun, 16 January 2011 18:18 ] [ ID #2053257 ]

Re: genbank library file

First a few notes about the code.

On Sun, Jan 16, 2011 at 12:18 PM, galeb abu-ali <abualiga2 [at] gmail.com> wrote:
> #!/usr/bin/perl
> # create_gb_library.pl
> use strict; use warnings;
>
> open IDFILE, shift or die "can't read idfile!\n";

You should probably add some error checking here to make sure that
the right arguments are passed. It will also make the code easier to
follow. You should also use lexical file handles and the 3-argument
open: see "perldoc -f open" for details. The two-argument version that
you're using interprets the second argument in special ways that could
result in executing a process instead of opening a file.

For example:

# Assuming the program should always have exactly 1 argument:
die "Usage: $0 id_file" unless @ARGV == 1;

# Store the file name in a variable for clarity.
my $id_file_name = shift @ARGV;

# Use 3-argument open, with a lexical file handle.
# Also output $! in die message so the user knows
# what's wrong.
open my $id_fh, '<', $id_file_name or
die "Failed to open id file '$id_file_name': $!";

> my @ids = '';
>
> while( <IDFILE> ) {
>    my $filename = $_;

This would be cleaner if the file name was declared on the while line:

while(my $filename = <$id_fh>)
{

>    push( @ids, $filename );
> }
>
> my @library;
>
> for my $id( @ids ) {

If $id is a file name then a better name would probably be
$id_file_name, but then we already have said variable from above. Are
these the same types of files or perhaps the names are ambiguous? For
now, we might as well use $filename, as we did reading them in.

for my $filename (@ids)
{

>    chomp $id;
>    open FILE, $id or die "can't open file $id!\n";

Again, better with the 3-argument open, a lexical file handle, and
outputting $!:

open my $fh, '<', $filename or
die "Failed to open file '$filename': $!";

>    $/ = "//\n";    # input-record separator is genbank end-of-record separator

I think that you should limit the scope of that so that its effects
are temporary.

local $/ = "//\n";

>    my $record = <FILE>;

my $record = <$fh>;
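
To illustrate the scoping point, here is a minimal sketch (not your
program) that reads "//\n"-terminated records from an in-memory string.
The sample data and the read_records name are made up for the
demonstration, and unlike your loop it reads every record rather than
just the first one.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every "//\n"-terminated record from a handle.
sub read_records {
    my ($fh) = @_;

    # local restores the previous value of $/ when this sub returns.
    local $/ = "//\n";
    my @records = <$fh>;
    return @records;
}

# Made-up sample data standing in for the contents of a GenBank file.
my $data = "LOCUS A\nORIGIN\n//\nLOCUS B\nORIGIN\n//\n";

# Opening a reference to a scalar gives an in-memory file handle.
open my $fh, '<', \$data or die "Failed to open in-memory handle: $!";
my @records = read_records($fh);
close $fh or warn "Failed to close in-memory handle: $!";

printf "%d records; separator is back to a newline: %s\n",
    scalar @records, ($/ eq "\n" ? 'yes' : 'no');
```

Because the change to $/ is confined to the sub, any later code that
reads line by line (such as reading the id file) is unaffected.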

>    push( @library, $record)
> }

> close FILE;

If you're going to explicitly close this file (IIRC, a lexical file
handle will close it automatically when it goes out of scope) then you
should probably do it at the end of the loop where it was opened.
Also, close can fail (though I think unlikely here) so you might as
well check for success and warn.

close $fh or warn "Failed to close file '$filename': $!";
}

> open OUTLIBRARY, ">gblibrary.txt" or die "can't open gblibrary.txt!\n";

The 3-arg open might be preferred here too and for certain a lexical
file handle would still be preferred. Again, you should output the $!
variable if it fails:

my $out_file_name = 'gblibrary.txt';
open my $out_fh, '>', $out_file_name or
die "Failed to open output file '$out_file_name': $!";

> print OUTLIBRARY @library, "\n";

I'm not sure if it's necessary, but with lexical file handles I always
enclose the handle in a block to be sure Perl knows what I mean:

print { $out_fh } @library, "\n";

> close OUTLIBRARY;

Again, I think the lexical file handle should automatically close this
file, but if we're going to close it we might as well be explicit
about it.

close $out_fh or
die "Failed to close output file '$out_file_name': $!";

Now on to why it isn't working.

> opening individual files. The code is below. I successfully populate
> @ids but cannot manage to loop through @ids, open files and
> populate @library as I get "can't open file !\n"; messages. Please
> advise at your convenience,

How do you know that you successfully populated @ids with the correct
data? Have you output it to see? To be certain, you can use
Data::Dumper to output the raw array.

use Data::Dumper;

print Dumper \@ids;

The message "can't open file !\n" suggests that your file name
variable is empty, which would explain why opening fails. So confirm
what is in the array. You can also dump the value of the file name
variable:

print Dumper \$filename;

Look up Data::Dumper on CPAN for detailed usage (you could try perldoc
too: perldoc Data::Dumper). I would guess that the problem is the @ids
array not containing what you think it does. Outputting the $!
variable in your die calls should also help to identify problems,
though it looks like it won't add anything extra this time.
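
Putting the notes above together, a rough sketch might look like the
following. The build_library name is just something I made up to keep
the pieces together, and skipping empty file names is my assumption
about what you want; adjust to taste.

```perl
#!/usr/bin/perl
# create_gb_library.pl (sketch)
use strict;
use warnings;

# Hypothetical helper: reads file names from $id_file_name and returns
# the first "//\n"-terminated record of each named file.
sub build_library {
    my ($id_file_name) = @_;

    open my $id_fh, '<', $id_file_name
        or die "Failed to open id file '$id_file_name': $!";

    my @filenames;
    while (my $filename = <$id_fh>) {
        chomp $filename;
        push @filenames, $filename;
    }
    close $id_fh or warn "Failed to close id file '$id_file_name': $!";

    my @library;
    for my $filename (@filenames) {
        next if $filename eq '';    # assumption: ignore blank lines

        open my $fh, '<', $filename
            or die "Failed to open file '$filename': $!";

        # Limit the separator change to this loop iteration.
        local $/ = "//\n";
        my $record = <$fh>;
        push @library, $record if defined $record;

        close $fh or warn "Failed to close file '$filename': $!";
    }
    return @library;
}

if (@ARGV == 1) {
    my $out_file_name = 'gblibrary.txt';
    open my $out_fh, '>', $out_file_name
        or die "Failed to open output file '$out_file_name': $!";
    print { $out_fh } build_library($ARGV[0]), "\n";
    close $out_fh
        or die "Failed to close output file '$out_file_name': $!";
}
else {
    warn "Usage: $0 id_file\n";
}
```

Note that @filenames is declared without an initial value, so it starts
out with no elements at all.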

--
Brandon McCaig <http://www.bamccaig.com> <bamccaig [at] gmail.com>
V zrna gur orfg jvgu jung V fnl. Vg qbrfa'g nyjnlf fbhaq gung jnl.
Castopulence Software <http://www.castopulence.org/> <bamccaig [at] castopulence.org>

--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
Brandon McCaig [ Sun, 16 January 2011 19:58 ] [ ID #2053258 ]

Re: genbank library file

galeb abu-ali wrote:
> Hi,

Hello,

> I'm trying to create a genbank library file that contains several genbank
> records. I read in the genbank record names from a separate file into an
> array and then loop through array of file names, open each file and read
> contents into another array. The problem is in looping through the array and
> opening individual files. The code is below. I successfully populate @ids
> but cannot manage to loop through @ids, open files and populate @library as
> I get "can't open file !\n"; messages. Please advise at your convenience,
> Many thanks,
>
> galeb
>
> #!/usr/bin/perl
> # create_gb_library.pl
> use strict; use warnings;
>
> open IDFILE, shift or die "can't read idfile!\n";
>
> my @ids = '';

Why are you assigning '' to $ids[0]?


> while(<IDFILE> ) {
> my $filename = $_;

That is usually written as:

while ( my $filename = <IDFILE> ) {

You should also use chomp here.

chomp $filename;


> push( @ids, $filename );
> }

Perhaps you should just do that like this instead:

open IDFILE, '<', shift or die "can't read idfile because: $!";

chomp( my @ids = <IDFILE> );


> my @library;
>
> for my $id( @ids ) {
> chomp $id;
> open FILE, $id or die "can't open file $id!\n";

The first entry in @ids is '', which you assigned yourself, and which
can never be a valid file name, hence the error message "can't open file
!\n". Notice the space between 'file' and '!'.
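
A small sketch showing the difference; the variable names here are made
up for the demonstration and are not from your program:

```perl
use strict;
use warnings;

my @with_empty = '';    # one element: the empty string
my @truly_empty;        # no elements at all

printf "with_empty has %d element(s), truly_empty has %d\n",
    scalar @with_empty, scalar @truly_empty;

# Interpolating that first element reproduces the text of your message.
my $message = "can't open file $with_empty[0]!";
print "$message\n";
```

Declaring the array with a plain "my" and no assignment avoids the
bogus first entry.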

Do the file names you get use absolute paths or relative paths? Is your
program's current working directory the place where these files exist or
are they in another directory? Do you need to prepend a directory name
in order to access them?
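
If it turns out that a directory does need to be prepended, something
along these lines might do. The directory and the file names below are
invented for illustration; only the File::Spec calls are real.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical directory holding the GenBank files; adjust as needed.
my $gb_dir = '/data/genbank';

# Made-up names, as they might look after chomp-ing the id file.
my @ids = ( 'NC_000913.gb', '/tmp/other.gb' );

my @paths;
for my $id (@ids) {
    # Leave absolute paths alone; prepend the directory to relative ones.
    if ( File::Spec->file_name_is_absolute($id) ) {
        push @paths, $id;
    }
    else {
        push @paths, File::Spec->catfile( $gb_dir, $id );
    }
}

print "$_\n" for @paths;
```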



John
--
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction. -- Albert Einstein

jwkrahn [ Sun, 16 January 2011 20:41 ] [ ID #2053259 ]