Please excuse the NOOB question - but I guess since this is a

Okay;

I am sure that someone out there has done this before - I *think* I am
on the right track.

I have a directory full of emails. What I would like to do is read
each file in, then parse them into a CSV style file.

Example:


#!/usr/bin/perl

use warnings;
use strict;

open FILE , "/home/gmillard/SentMail/YourSatSetup.txt" or die $!;
my $linenum =1;

while (<FILE>) {
print "|", $linenum++;
print"$_" ;
}

Produces the following.

|1From - Sun Feb 21 11:40:01 2010
|2X-Mozilla-Status: 0001
|3X-Mozilla-Status2: 00000000
|4X-Gmail-Received: 58fa0ec68ca9975c1d187ceadc0ad3aeb1026134
|5Received: by 10.48.212.6 with HTTP; Fri, 17 Nov 2006 12:52:26 -0800
(PST)
|6Message-ID:
<234ff75a0611171252x3ea2facdw55cd81ec3a185926 [at] mail.gmail.com>
|7Date: Fri, 17 Nov 2006 15:52:26 -0500
|8From: "xxxxxxxxxxxxxxxxxxxxxxxx>
|9To: xxxxxx [at] bell.blackberry.net
|10Subject: Your satellite set up. . From an article that i read.
|11MIME-Version: 1.0
|12Content-Type: text/plain; charset=ISO-8859-1; format=flowed
|13Content-Transfer-Encoding: 7bit
|14Content-Disposition: inline
|15Delivered-To: xxxxxxxxxxxxxxxxxxxxxx
|16
|17Hi Andrew;
|18I read an article about you a while back about your MythTV and VOip
|19setup. Would you mind if i asked you some tech questions ? I am
very
|20intrigued.
|21Thanks
|22Glen xxxxxxxxxx
|23xxxxxxxxxxxxx


I have hundreds of emails in this directory. I would like to parse
them into a single file where each comma separated/tab separated field
is a line from the email.

So, the first line of the CSV file is
|1From - Sun Feb 21 11:40:01 2010|2X-Mozilla-Status: 0001|3X-Mozilla-
Status2: 00000000|4X-Gmail-Received:
58fa0ec68ca9975c1d187ceadc0ad3aeb1026134
<truncated>

and each subsequent line is the next email and so forth.

Any words of wisdom?

Thanks much.

Glen


--
To unsubscribe, e-mail: beginners-unsubscribe [at] perl.org
For additional commands, e-mail: beginners-help [at] perl.org
http://learn.perl.org/
GlenM [ Thu, 25 February 2010 20:03 ] [ ID #2033563 ]

Re: Please excuse the NOOB question - but I guess since this is a perl.beginners group

glen,

you posted this to comp.lang.perl.misc and got plenty of help
there. again, your subject line has nothing to do with the actual
request. please learn to use these resources properly with good
subjects (state the technical problem, not a begging for help). also why
did you ask for more help when the problem was mostly solved already?

uri

--
Uri Guttman ------ uri [at] stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------

Uri Guttman [ Fri, 26 February 2010 16:32 ] [ ID #2033564 ]

Re: Please excuse the NOOB question - but I guess since this is a perl.beginners group

Hi,

GlenM <glenmillard [at] gmail.com> wrote:
> I am sure that someone out there has done this before - I *think* I am
> on the right track.
>
> I have a directory full of emails. What I would like to do is read
> each file in, then parse them into a CSV style file.

Quick aside: you can use the $. builtin variable to get the line number
instead of keeping track of it yourself.
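
A minimal sketch of that aside, assuming the same "|<number><line>" layout as the original script. The helper name number_lines and the in-memory sample are made up for illustration; only $. itself comes from the suggestion above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Prefix each line read from an open filehandle with "|<line number>",
# using $. (the line number of the last-read handle) instead of a counter.
sub number_lines {
    my ($fh) = @_;
    my @out;
    while (my $line = <$fh>) {
        chomp $line;
        push @out, '|' . $. . $line;
    }
    return @out;
}

# Demo on an in-memory file so the sketch runs without any mail on disk.
my $sample = "From - Sun Feb 21 11:40:01 2010\nX-Mozilla-Status: 0001\n";
open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";
print "$_\n" for number_lines($fh);
close($fh);
```

Each filehandle keeps its own count, so a freshly opened file starts again at 1.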

Since you asked for suggestions, here's my $0.02:

Putting mails line by line into a CSV file seems futile to me, since
the number and position of mail headers is variable.
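
To illustrate the point, here is a rough sketch that keys columns by header name rather than by line position, so every row has the same layout no matter how many headers a mail carries. This is a different approach from the one in the question, not a drop-in fix. The column list @WANTED and the sub names are hypothetical, and the parsing is deliberately simplistic (first occurrence of each header only, no MIME decoding).

```perl
use strict;
use warnings;

# Hypothetical choice of columns; adjust to the headers you care about.
my @WANTED = qw(Date From To Subject);

# Parse the header block of one raw message into a hash keyed by the
# lower-cased header name. Folded continuation lines (leading whitespace)
# are appended to the header they belong to.
sub parse_headers {
    my ($raw) = @_;
    my (%header, $last);
    for my $line (split /\r?\n/, $raw) {
        last if $line eq '';                      # blank line ends the headers
        if (defined $last && $line =~ /^\s+(.*)/) {
            $header{$last} .= " $1";              # continuation of previous header
        }
        elsif ($line =~ /^([^\s:]+):\s*(.*)/) {
            my ($name, $value) = (lc $1, $2);
            if (exists $header{$name}) {
                $last = undef;                    # keep only the first occurrence
            }
            else {
                $header{$name} = $value;
                $last = $name;
            }
        }
    }
    return \%header;
}

# Build one tab-separated row with a fixed column order.
sub header_row {
    my ($raw) = @_;
    my $h = parse_headers($raw);
    return join "\t", map { $h->{lc $_} // '' } @WANTED;
}

my $mail = join "\n",
    'From - Sun Feb 21 11:40:01 2010',
    'Date: Fri, 17 Nov 2006 15:52:26 -0500',
    'Subject: Your satellite set up.',
    ' From an article that i read.',
    'MIME-Version: 1.0',
    '',
    'Hi Andrew;';
print header_row($mail), "\n";
```

Missing headers simply become empty columns, which keeps the rows aligned.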

If you want to create a database of your mails or do some other sort of
data mining, I would suggest that you look at something like the
MIME-Tools bundle[1] to do the actual parsing and then concentrate on
doing something useful with the extracted information.

HTH,
Thomas

1) http://search.cpan.org/~doneill/MIME-tools-5.427/lib/MIME/Tools.pm


t.baetzler [ Fri, 26 February 2010 16:36 ] [ ID #2033565 ]

Re: Please excuse the NOOB question - but I guess since this is a perl.beginners group

GlenM wrote:
> Okay;
>
> I am sure that someone out there has done this before - I *think* I am
> on the right track.
>
> I have a directory full of emails. What I would like to do is read
> each file in, then parse them into a CSV style file.
>
> Example:
>
>
> #!/usr/bin/perl
>
> use warnings;
> use strict;
>
> open FILE , "/home/gmillard/SentMail/YourSatSetup.txt" or die $!;
> my $linenum =1;
>
> while (<FILE>) {
> print "|", $linenum++;
> print"$_" ;
> }

>
> So, the first line of the CSV file is
> |1From - Sun Feb 21 11:40:01 2010|2X-Mozilla-Status: 0001|3X-Mozilla-
> Status2: 00000000|4X-Gmail-Received:
> 58fa0ec68ca9975c1d187ceadc0ad3aeb1026134
> <truncated>

Use opendir/readdir, an array and join(), for example something like this:

....
my @csv;
opendir(my $dfh, '/home/gmillard/SentMail') or die($!);
while (defined(my $fname = readdir($dfh))) {
    # Ignore hidden files and anything that is not a .txt file
    next if ($fname =~ /^\./ || $fname !~ /\.txt$/i);

    my $fullname = '/home/gmillard/SentMail/' . $fname;

    # Ignore non-files
    next if (!(-f $fullname));

    my @columns;
    # Open file and read it into @columns
    open(my $fh, '<', $fullname) or die($!);
    my $linecnt = 1;
    while (my $line = <$fh>) {
        # Add every line to @columns, prefixed with its line number
        chomp $line;
        push @columns, "$linecnt$line";
        $linecnt++;
    }
    close($fh);
    # Join the whole @columns into a single scalar and
    # push that into @csv
    push @csv, join('|', @columns);
}
closedir($dfh);

# Output data to file
open(my $ofh, '>', 'output.csv') or die($!);
foreach my $csvline (@csv) {
    print $ofh "$csvline\n";
}
close($ofh);
....


This reads the whole lot into memory (which is fast if you have enough).
If you have a BIG mail folder, you might have to think of a slightly
different strategy.
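
One possible strategy, sketched under the assumption that a single mail always fits in memory: write each mail's row to the output file as soon as it is built instead of collecting all rows first. The sub name mails_to_csv and the temporary directory in the demo are invented for this example, so it can be tried without touching a real mail folder.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write one '|'-joined row per *.txt file in $dir to $out, one mail at a
# time, so memory use is bounded by the largest single mail.
sub mails_to_csv {
    my ($dir, $out) = @_;
    opendir(my $dfh, $dir) or die "Cannot open directory $dir: $!";
    my @names = sort grep { /\.txt$/i } readdir($dfh);
    closedir($dfh);

    open(my $ofh, '>', $out) or die "Cannot write $out: $!";
    my $count = 0;
    for my $fname (@names) {
        my $fullname = "$dir/$fname";
        next unless -f $fullname;                 # ignore non-files
        open(my $fh, '<', $fullname) or die "Cannot open $fullname: $!";
        my @columns;
        while (my $line = <$fh>) {
            chomp $line;
            push @columns, $. . $line;            # $. is the current line number
        }
        close($fh);
        print {$ofh} join('|', @columns), "\n";   # row goes straight to disk
        $count++;
    }
    close($ofh) or die "Cannot close $out: $!";
    return $count;
}

# Demo with two throwaway files in a temporary directory.
my $dir = tempdir(CLEANUP => 1);
for my $n (1, 2) {
    open(my $fh, '>', "$dir/mail$n.txt") or die "Cannot create test mail: $!";
    print {$fh} "Subject: test $n\nbody $n\n";
    close($fh);
}
my $written = mails_to_csv($dir, "$dir/output.csv");
print "Wrote $written rows\n";
```

Only the list of file names and one mail at a time are held in memory; the rows themselves never accumulate.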

Also, your CSV layout will break if even a single mail contains a pipe
sign. I leave it up to the reader to find a workaround (typing "perldoc
perlre" at the command prompt might give you a hint, though).
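
For readers who want to check their answer, one possible workaround is to backslash-escape pipes (and backslashes) before joining, and to undo that when reading the file back. This is only a sketch of one convention, not necessarily the one hinted at above; escape_field and split_row are names invented here.

```perl
use strict;
use warnings;

# Escape backslashes and pipes so a literal '|' inside a mail line can
# never be mistaken for the column separator.
sub escape_field {
    my ($field) = @_;
    $field =~ s/([\\|])/\\$1/g;
    return $field;
}

# Split one row back into fields, undoing the escaping on the way.
sub split_row {
    my ($row) = @_;
    my @fields = ('');
    while ($row =~ /\G(?:\\(.)|(\|)|([^\\|]+))/gs) {
        if    (defined $2) { push @fields, '' }      # bare pipe: next column
        elsif (defined $1) { $fields[-1] .= $1 }     # escaped character
        else               { $fields[-1] .= $3 }     # ordinary text
    }
    return @fields;
}

my @columns = ('1Subject: a|b', '2path C:\\tmp', '3Thanks');
my $row     = join '|', map { escape_field($_) } @columns;
print "$row\n";
print join("\n", split_row($row)), "\n";
```

Any program that later reads the file has to follow the same escaping convention, so a CPAN module such as Text::CSV may be the more portable choice if the output is meant for other tools.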

LG
Rene

Rene Schickbauer [ Fri, 26 February 2010 16:38 ] [ ID #2033566 ]