text encoding

Hello,

How can I tell which encoding a given text uses?
For example, is it ISO-8859-1, UTF-8, KOI8-U, or something else?

Thanks.

terry peng [ Tue, 08 February 2011 15:39 ] [ ID #2054801 ]

Re: text encoding

On Tuesday 08 Feb 2011 16:39:56 terry peng wrote:
> Hello,
>
> How can I tell which encoding a given text uses?
> For example, is it ISO-8859-1, UTF-8, KOI8-U, or something else?
>

I recall finding a CPAN module or two for that:

* http://search.cpan.org/dist/Text-GuessEncoding/

* http://search.cpan.org/dist/Encode-Detect/

I should note that guessing an encoding is not reliable: an 8-bit text can be
ISO-8859-1, but it might as well be ISO-8859-2, etc. 7-bit ASCII is upward
compatible with all of them and with UTF-8, so you can only find out which one
is the case by looking for common patterns of usage.
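
To illustrate the ambiguity, here is a minimal sketch using Encode::Guess,
which ships with Perl as part of the Encode distribution. It is not one of the
two modules linked above; it is swapped in here only because it needs no
installation. The helper name guess_name and the sample byte strings are made
up for this example.

```perl
use strict;
use warnings;
use 5.010;            # for the // (defined-or) operator
use Encode::Guess;    # part of the core Encode distribution

# Hypothetical helper: return the name of the guessed encoding, or undef
# when the guess fails or is ambiguous. @suspects are extra encodings to
# try in addition to the defaults (ascii, utf8, BOM-marked UTF-16/32).
sub guess_name {
    my ($octets, @suspects) = @_;
    my $enc = guess_encoding($octets, @suspects);
    # On success we get an encoding object; on failure a plain error string.
    return ref $enc ? $enc->name : undef;
}

my $utf8_bytes   = "na\xc3\xafve";    # "naive" with i-diaeresis, as UTF-8
my $latin1_bytes = "na\xefve";        # the same word as ISO-8859-1

# Valid UTF-8 that is not plain ASCII: only one default suspect fits.
say guess_name($utf8_bytes) // 'unknown';

# One 8-bit suspect: the bytes are not valid UTF-8, so it is the only fit.
say guess_name($latin1_bytes, 'iso-8859-1') // 'unknown';

# Two 8-bit suspects: both accept every byte, so no guess is possible.
say guess_name($latin1_bytes, 'iso-8859-1', 'iso-8859-2') // 'ambiguous';
```

The last call shows the problem described above: every byte sequence is valid
in both ISO-8859-1 and ISO-8859-2, so the decoder alone cannot pick one, and
telling them apart requires looking at which characters and words are likely
in the language you expect.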

Regards,

Shlomi Fish

> Thanks.


Shlomi Fish [ Tue, 08 February 2011 16:12 ] [ ID #2054802 ]