text encoding
Hello,
How can I tell which encoding a given text uses?
For example, is it ISO-8859-1, UTF-8, KOI8-U, or something else?
Thanks.
Re: text encoding
On Tuesday 08 Feb 2011 16:39:56 terry peng wrote:
> Hello,
>
> How can I tell which encoding a given text uses?
> For example, is it ISO-8859-1, UTF-8, KOI8-U, or something else?
>
I recall finding a CPAN module or two for that:
* http://search.cpan.org/dist/Text-GuessEncoding/
* http://search.cpan.org/dist/Encode-Detect/
I should note that guessing an encoding is not reliable - an 8-bit text can be
ISO-8859-1, but it might just as well be ISO-8859-2, etc. 7-bit ASCII is
upward compatible with all of them and with UTF-8, so you can only tell which
one is the case by looking for common patterns of usage.
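To illustrate the ambiguity, here is a minimal sketch, written in Python only for brevity (the idea is language-neutral). The function name plausible_encodings, the candidate list, and the sample text are all assumptions made for this example; it is not how the CPAN modules above work internally. It simply tries to decode the same bytes under several encodings and reports which ones do not fail:

```python
# Illustrative sketch only: the candidate list and sample text are assumptions.
CANDIDATES = ["ascii", "utf-8", "koi8_u", "iso8859_1", "iso8859_2"]

def plausible_encodings(data: bytes, candidates=CANDIDATES):
    """Return the candidate encodings under which `data` decodes without error."""
    ok = []
    for name in candidates:
        try:
            data.decode(name)  # strict decoding: raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok

# A Russian word encoded as KOI8-U (hypothetical input).
sample = "Привет".encode("koi8_u")

# Pure ASCII bytes are accepted by every candidate.
print(plausible_encodings(b"Hello"))

# The 8-bit sample is rejected by ASCII and UTF-8, but every single-byte
# candidate still accepts it, so decoding alone cannot pick the right one.
print(plausible_encodings(sample))

# Only one of the "successful" decodings yields readable text.
for name in ("koi8_u", "iso8859_1"):
    print(name, repr(sample.decode(name)))
```

A successful decode only means the bytes are valid in that encoding, not that the result is the intended text. Telling the single-byte candidates apart takes statistical knowledge of what real text in each language looks like, which is why the result is always a guess.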
Regards,
Shlomi Fish