csv split, field with embedded comma

I've got what I'm sure is an old problem, but my Perl has gotten rusty
since I originally wrote these scripts.

I have a set of scripts that process various CSV files to generate
reports from a Unix-based POS system. Recently a few sites started
reporting strange results, and after some investigation I discovered
that in one user-defined string field in a few records, someone had
been entering strings that contained commas. With my simplistic split
on commas, this shifted all of the following fields over by one,
royally screwing up the reports.

The field in question is enclosed in quotes, so I believe I should be
able to work around this by modifying the split, or with a regex to
substitute out the comma in the quote-delimited field, but a simple
solution is escaping me.

Has anyone got a quick fix?

Thanks!
Mark Tutt [ Thu, 23 June 2005 20:59 ] [ ID #850872 ]

Re: csv split, field with embedded comma

Mark Tutt wrote:
> I've got what I'm sure is an old problem, but my perl has gotten rusty
> since I originally wrote these scripts.
>
> I have a set of scripts that process various CSV files to generate
> reports from a Unix based POS system. Recently a few sites started
> reporting problems with strange results, and after some investigation
> I discovered that in one user defined string field in a few records,
> someone had been entering strings that contained commas. With my
> simplistic split on commas, this shifted all of the fields over by
> one, royally screwing up the reports.
>
> The field in question is enclosed in quotes, so I believe I should be
> able to work around this by modifying the split, or with a regex to
> substitute out the comma in the quote delimited field, but a simple
> solution is escaping me.
>
> Has anyone got a quick fix?

Hello Mark.

This Question is Asked Frequently, and so the answer comes
pre-installed with standard distributions of Perl. You can read the
answer by examining the Perl FAQ, by typing:
perldoc -q delimited
at your console.

There are a couple of different suggestions contained therein. One is
to replace your "split" with a relatively complex regexp. The other is
to use a module: either Text::CSV (from CPAN) or Text::ParseWords
(standard) should do nicely.
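
For the regexp route, here is a minimal sketch rather than the FAQ's
exact expression. It assumes each record sits on one line, has already
been chomped, and writes a literal quote inside a quoted field as two
quotes. The helper name split_csv_line and the sample record are made
up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split one chomped CSV line into fields,
# treating commas inside double-quoted fields as data.
sub split_csv_line {
    my ($line) = @_;
    my @fields;

    # Each pass matches one field (quoted or plain) plus its terminator,
    # which is either a comma or the end of the string.
    while ($line =~ /\G(?:"((?:[^"]|"")*)"|([^,]*))(,|\z)/g) {
        # Copy the captures before running any other regex.
        my ($quoted, $plain, $sep) = ($1, $2, $3);

        my $field = defined $quoted ? $quoted : $plain;
        $field =~ s/""/"/g if defined $quoted;   # "" inside quotes means "
        push @fields, $field;

        last if $sep eq '';                      # end of line reached
    }
    return @fields;
}

my @fields = split_csv_line('101,"Smith, John",4.50');
print join('|', @fields), "\n";   # prints 101|Smith, John|4.50
```

Malformed input such as an unclosed quote simply falls through to the
unquoted branch, so for anything beyond well-behaved files a module is
probably the safer choice.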

Paul Lalli
Paul Lalli [ Thu, 23 June 2005 21:21 ] [ ID #850873 ]

Re: csv split, field with embedded comma

On Thu, 23 Jun 2005 18:59:10 +0000, Mark Tutt wrote:

> I've got what I'm sure is an old problem, but my perl has gotten rusty
> since I originally wrote these scripts.
>
> I have a set of scripts that process various CSV files to generate
> reports from a Unix based POS system. Recently a few sites started
> reporting problems with strange results, and after some investigation
> I discovered that in one user defined string field in a few records,
> someone had been entering strings that contained commas. With my
> simplistic split on commas, this shifted all of the fields over by
> one, royally screwing up the reports.
>
> The field in question is enclosed in quotes, so I believe I should be
> able to work around this by modifying the split, or with a regex to
> substitute out the comma in the quote delimited field, but a simple
> solution is escaping me.
>
> Has anyone got a quick fix?

Don't use split. Instead use Text::ParseWords. It's a standard part of the
Perl distribution.
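
A minimal sketch of what that might look like, using parse_line from
Text::ParseWords. The sample record is invented; in the real scripts
the line would come from the CSV file after a chomp.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Invented sample record; the second field contains a comma.
my $line = '101,"Smith, John",4.50';

# Arguments: delimiter pattern, keep flag, text to parse.
# A false keep flag strips the enclosing quotes from each field.
my @fields = parse_line(',', 0, $line);

print scalar(@fields), " fields\n";   # prints 3 fields
print join('|', @fields), "\n";       # prints 101|Smith, John|4.50
```

One caveat: Text::ParseWords also treats single quotes and backslashes
as special, so data containing a stray apostrophe may not parse the way
you expect. If the field can hold arbitrary user input, Text::CSV from
CPAN is likely the more robust option.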

Dave...
Dave Cross [ Sat, 25 June 2005 09:19 ] [ ID #853639 ]