Follow up on myODBC 3.51.10 and UTF-8 with mySQL 4.1.8

Config:
mySQL 4.1.8 - UTF-8 charset by default for all databases, tables and
fields, with matching collations
myODBC 3.51.10
All of this is used from ASP pages or from VB/VBScript programs.

Summary of the problem:
We have a huge amount of data in a text file that we want to import
into mySQL for easier management.
The format of the text file does not lend itself to a parser or any
other easy way of extracting the data, so we need to read it
sequentially.
The data can be in several languages and is encoded in UTF-8.
The ASP FileSystemObject cannot handle the file because it is UTF-8
rather than Unicode UCS-2.
So we decided to use LOAD DATA INFILE to load the data into mySQL, and
then use a recordset to browse it and do our own parsing.
The problem is that, for certain languages, the data is corrupted
when we read it (and then rewrite it into a clean database
structure), and we don't know exactly why. Our guess is that the old
text protocol used in myODBC 3.51 may be the cause, because all our
ASP pages use the UTF-8 codepage declaration, and when we look at the
imported data with mySQL Administrator, for instance (which seems to
use the new binary protocol), the data is valid in the huge
unformatted table.
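
For comparison, here is a minimal sketch of the sequential read done outside ASP. It assumes Python is an acceptable alternative to the FileSystemObject, which the thread does not state. The function name iter_utf8_lines and the sample content are hypothetical. The file is decoded as UTF-8 while iterating line by line, so it never has to be held in memory as a whole.

```python
import io
import os
import tempfile


def iter_utf8_lines(path):
    """Yield the lines of a UTF-8 text file one at a time, without the newline."""
    # encoding="utf-8" decodes multi-byte characters correctly;
    # errors="strict" raises on invalid bytes instead of silently corrupting them.
    with io.open(path, "r", encoding="utf-8", errors="strict") as handle:
        for line in handle:
            yield line.rstrip("\r\n")


# Hypothetical sample data standing in for the real import file.
sample = u"Français\nCämğiät\nMűvészet és kultúra\n"
fd, sample_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as out:
    out.write(sample.encode("utf-8"))

lines = list(iter_utf8_lines(sample_path))
os.remove(sample_path)
print(lines)
```

Reading this way sidesteps the UCS-2 limitation, but it says nothing about what happens once the data passes through the ODBC driver.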

Examples of valid words even with "special characters":
========================================================
Français
Viðskipti

Examples of erroneous words with "special characters":
============================
Cämğiät
Aşsu
Művészet és kultúra
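
One pattern in the two lists may be worth checking. The special characters in the valid words (ç, ð) all exist in ISO-8859-1 (Latin-1), while each erroneous word contains at least one character that does not (ğ, ş, ű). The sketch below only tests that repertoire. It is an assumption, not something confirmed in this thread, that some step between the server and the ASP page converts the data to a Latin-1 style single-byte charset. The helper names are hypothetical.

```python
def survives_latin1(word):
    """Return True if every character of word can be encoded as Latin-1."""
    try:
        word.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def missing_from_latin1(word):
    """List the characters of word that lie outside the Latin-1 range."""
    # Latin-1 covers exactly the code points U+0000 to U+00FF.
    return [ch for ch in word if ord(ch) > 0xFF]


words = [u"Français", u"Viðskipti", u"Cämğiät", u"Aşsu", u"Művészet és kultúra"]
report = {w: missing_from_latin1(w) for w in words}
for w in words:
    print(w, survives_latin1(w), report[w])
```

If the corrupted output shows a replacement character exactly where ğ, ş or ű should be while ä and é come through intact, that would be consistent with a lossy charset conversion rather than with the wire protocol itself.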

We are posting this follow-up to the myODBC list because we would like
to know whether we need to wait for a major release of myODBC or
whether this could be fixed in a minor one, in case it is caused not
by connection protocol issues but by an error in extended character
handling or something similar. If this cannot be fixed before a major
release, what are our options for avoiding the problem? Use another
connector, and thus a platform other than ASP or VBScript?

Thank you for your reply
CheHax

--
MySQL ODBC Mailing List
For list archives: http://lists.mysql.com/myodbc
To unsubscribe: http://lists.mysql.com/myodbc?unsub=gcdmo-myodbc [at] m.gmane.org
HMax [ Mon, 17 January 2005 12:28 ] [ ID #590676 ]

Re: Follow up on myODBC 3.51.10 and UTF-8 with mySQL 4.1.8

HMax,

MyODBC 3.51.10 can work with MySQL 4.1 but does not take advantage of
new protocol features of MySQL 4.1. MyODBC 3.53 will take advantage of
new features in MySQL 4.1 and 5.x.


--
Peter Harvey, Software Developer
MySQL AB, www.mysql.com

Are you MySQL certified? www.mysql.com/certification



pharvey [ Fri, 21 January 2005 16:52 ] [ ID #599932 ]

Re: Follow up on myODBC 3.51.10 and UTF-8 with mySQL 4.1.8

So are the charset and UTF-8 functionalities linked to and dependent
on the new protocol? And do you have an approximate release date for
a 3.53 version?

Thanks



--
HMax

HMax [ Fri, 21 January 2005 17:05 ] [ ID #599938 ]

Re: Follow up on myODBC 3.51.10 and UTF-8 with mySQL 4.1.8

HMax,

I have yet to fully investigate charset issues for MyODBC, but I can
say that MyODBC does not currently support UNICODE in any real way,
nor does it otherwise do much that is special for charsets.

MyODBC 3.53 is to include UNICODE support, but not in the Alpha
release. Unfortunately, I cannot provide an estimated release date for
3.53 at this time.


--
Peter Harvey, Software Developer
MySQL AB, www.mysql.com

Are you MySQL certified? www.mysql.com/certification



pharvey [ Fri, 21 January 2005 18:54 ] [ ID #600620 ]