Re: DataDirect Driver, ExecDirect and UTF-8

Hi,

I wasn't a member of the mailing list when I sent this, so I'm not sure it
actually made it out there.
I apologize if this is a duplicate.

.....Ken


On 5/4/09 11:49 AM, "Ken Sell" <ksell [at] greenplum.com> wrote:

Hi,

I'm the new connectivity developer at GreenPlum. GreenPlum makes a data
warehouse DBMS based on PostgreSQL.
I'm working on a problem where a user is attempting to insert a non-ASCII
UTF-8 value (e.g. an 'o' with an umlaut).
The test does an insert into a table via SQLExecDirectW. The text looks
like this:

"insert into t1 values ('ö')"

I've built and debugged the PostgreSQL driver (version 8.02.0500). It looks
like the text makes it through the Driver Manager (i.e. DataDirect) OK. I see
the correct value in SQLExecDirectW in odbcapiw.c, but I also see the code in
SQLExecDirectW call ucs2_to_utf8. ucs2_to_utf8 tries to interpret the value
as UCS-2, but the value is UTF-8, so ucs2_to_utf8 corrupts it.
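
To make the corruption concrete, here is a minimal sketch of what happens when
the two UTF-8 bytes of 'ö' (0xC3 0xB6) are misread as a single 16-bit code
unit and then re-encoded as UTF-8. The helper names are hypothetical and this
is not psqlodbc's actual ucs2_to_utf8; it also assumes the bytes are read in
little-endian order. Under those assumptions the misread unit is 0xB6C3, which
re-encodes to the three bytes 0xEB 0x9B 0x83 instead of the original two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the psqlodbc implementation): encode one 16-bit
 * UCS-2 code unit as UTF-8. Returns the number of bytes written (1 to 3).
 * Surrogate code units are not treated specially. */
static size_t ucs2_unit_to_utf8(uint16_t unit, unsigned char *out)
{
    assert(out != NULL);
    if (unit < 0x80) {
        out[0] = (unsigned char)unit;
        return 1;
    }
    if (unit < 0x800) {
        out[0] = (unsigned char)(0xC0 | (unit >> 6));
        out[1] = (unsigned char)(0x80 | (unit & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (unit >> 12));
    out[1] = (unsigned char)(0x80 | ((unit >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (unit & 0x3F));
    return 3;
}

/* Misread two UTF-8 bytes as one little-endian 16-bit unit, then re-encode.
 * This models the bug described above, not any correct conversion. */
static size_t misread_utf8_pair_as_ucs2(const unsigned char pair[2],
                                        unsigned char *out)
{
    uint16_t unit = (uint16_t)(pair[0] | (pair[1] << 8));
    return ucs2_unit_to_utf8(unit, out);
}
```

Passing the real code unit for 'ö' (0x00F6) to ucs2_unit_to_utf8 yields the
expected 0xC3 0xB6, which is why the input has to be genuine 16-bit data
before it reaches a conversion of this kind.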

I also attempted to call SQLExecDirect (i.e. no W), but the DataDirect driver
manager tries to convert the umlaut value to ASCII and calls SQLExecDirectW
instead.

Can someone elaborate on the driver's correct behavior in this situation? If
the database is UTF-8 and the application is UTF-8, should the driver handle
this? Does the application (or driver manager) have to convert the string to
UCS-2 first?

Thanks,

.....Ken

Ken Sell [ Tue, 05 May 2009 20:50 ] [ ID #2000126 ]

Re: DataDirect Driver, ExecDirect and UTF-8

Ken Sell wrote:
> Hi,
>
> I wasn't a member of the mailing list when I sent this, so I'm not sure
> it actually made it out there.
> I apologize if this is a duplicate.
>
> ....Ken
>
>
> On 5/4/09 11:49 AM, "Ken Sell" <ksell [at] greenplum.com> wrote:
>
> Hi,
>
> I'm the new connectivity developer at GreenPlum. GreenPlum makes a
> data warehouse DBMS based on PostgreSQL.
> I'm working on a problem where a user is attempting to insert a
> non-ASCII UTF-8 value (e.g. an 'o' with an umlaut).
> The test does an insert into a table via SQLExecDirectW. The
> text looks like this:
>
> "insert into t1 values ('ö')"
>
> I've built and debugged the PostgreSQL driver (version 8.02.0500).
> It looks like the text makes it through the
> Driver Manager (i.e. DataDirect) OK. I see the correct value in
> SQLExecDirectW in odbcapiw.c, but I
> also see the code in SQLExecDirectW call ucs2_to_utf8. ucs2_to_utf8
> tries to interpret the value as
> UCS-2, but the value is UTF-8. The value is corrupted by ucs2_to_utf8.

The psqlodbc Unicode driver uses UTF-16 encoding, whereas your application
uses UTF-8 encoding. Isn't the following URL related to your problem?
http://media.datadirect.com/download/docs/odbc/allodbc/reference/unicode6.html
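
As an illustration of the encoding gap, the following is a minimal sketch of
decoding UTF-8 into UTF-16 code units, which is the form a wide ODBC entry
point would expect if SQLWCHAR is a 16-bit type. The function name
utf8_to_utf16 is hypothetical; it is not part of psqlodbc or the DataDirect
driver manager, and it performs only minimal validation (overlong forms and
encoded surrogates are not rejected).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode NUL-terminated UTF-8 into UTF-16 code units
 * in native byte order. Returns the number of units written, excluding the
 * terminating 0, or (size_t)-1 on malformed input or insufficient space. */
static size_t utf8_to_utf16(const char *src, uint16_t *dst, size_t cap)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    while (*s) {
        uint32_t cp;
        int extra;

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return (size_t)-1;  /* stray continuation or invalid lead byte */
        s++;
        while (extra-- > 0) {
            if ((*s & 0xC0) != 0x80)
                return (size_t)-1;  /* truncated or broken sequence */
            cp = (cp << 6) | (uint32_t)(*s++ & 0x3F);
        }
        if (cp >= 0x10000) {
            /* Outside the BMP: emit a surrogate pair. */
            if (cp > 0x10FFFF || n + 2 >= cap)
                return (size_t)-1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= cap)
                return (size_t)-1;
            dst[n++] = (uint16_t)cp;
        }
    }
    if (n >= cap)
        return (size_t)-1;  /* no room for the terminator */
    dst[n] = 0;
    return n;
}
```

For the statement in question, the two bytes 0xC3 0xB6 become the single code
unit 0x00F6, so the UTF-16 string is one unit shorter than the UTF-8 byte
string.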

regards,
Hiroshi Inoue

--
Sent via pgsql-odbc mailing list (pgsql-odbc [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-odbc
Hiroshi Inoue [ Wed, 06 May 2009 08:13 ] [ ID #2000315 ]

Re: DataDirect Driver, ExecDirect and UTF-8

Hi,

Thanks very much for the information. I think I have it figured out.

Here is my understanding. Feel free to correct me if I have gotten something
wrong.
In order for everything to work correctly with a UTF-8 application, a
DataDirect driver manager, and psqlodbc, you must do the following.

- Use the header files (e.g. sql.h) from the DataDirect driver manager
  installation.

- Compile the driver with SQLWCHARSHORT defined (e.g. #define
  SQLWCHARSHORT 1).
  This will cause SQLWCHAR to be defined as an unsigned short.

- Compile the application with SQLWCHARSHORT defined (e.g. #define
  SQLWCHARSHORT 1).
  This will cause SQLWCHAR to be defined as an unsigned short.

- Set the DriverUnicodeType data source attribute to 1 (i.e.
  DriverUnicodeType=1 in your odbc.ini file).
  This tells the driver manager that your driver expects UTF-16 encoding
  for all SQLWCHAR parameters.

- If your database encoding is UTF-8, the descriptor for character columns
  will be a wide type.
  For example, a char(1) column will be SQL_WCHAR.

- If you execute a SQL statement (e.g. SQLExecDirectW("insert into T1 values
  ('a')")), the driver manager (DataDirect) will convert the string from
  UTF-8 to UTF-16 and pass it to the driver as UTF-16. The driver will then
  translate it back to UTF-8. When the driver is a Unicode driver (i.e. you
  compiled psqlodbc as a Unicode driver), this translation in the driver
  manager happens whether or not you use a wide function. The driver manager
  converts both SQLCHAR and SQLWCHAR parameters to UTF-16. It also changes
  all calls to non-wide functions into calls to wide functions in the driver
  (e.g. SQLExecDirect gets changed to SQLExecDirectW). Note that SQLCHAR and
  SQLWCHAR strings are not translated from UTF-8 to UTF-16 in the same way.
  If the call is to a non-wide function, the translation assumes that the
  string contains only ASCII characters, so a non-ASCII character will
  probably get translated incorrectly. A call to a wide function translates
  UTF-8 to UTF-16 correctly.

- When using parameters, you must use the appropriate SQL and C types when
  describing the parameter values. If you say the C type is SQL_C_CHAR, the
  driver assumes the value is in UTF-8 for a database encoded in UTF-8. If
  you say the C type is SQL_C_WCHAR, I think the driver will assume the
  value is in UTF-16 and try to translate it to UTF-8.

- When using column values, you must use the appropriate SQL C type. If you
  say the column C type is SQL_C_WCHAR, then the returned value will be
  translated to UTF-16 by the driver. If you say the column C type is
  SQL_C_CHAR, then the value will be returned as UTF-8.
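
The difference between the two translation paths can be sketched as follows.
This is only a model of a conversion that assumes ASCII input, in which each
byte is zero-extended to one 16-bit unit. The function name widen_bytes is
hypothetical, and the DataDirect driver manager's real conversion may behave
differently. Under this model the UTF-8 bytes of 'ö' (0xC3 0xB6) become two
units, 0x00C3 and 0x00B6, rather than the single correct unit 0x00F6, while
plain ASCII such as 'a' passes through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of an ASCII-only widening: every input byte is
 * zero-extended to one 16-bit unit. Input that does not fit in cap - 1
 * units is truncated. Returns the number of units written, excluding the
 * terminating 0. */
static size_t widen_bytes(const char *src, uint16_t *dst, size_t cap)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && cap > 0);
    while (src[n] != '\0' && n + 1 < cap) {
        /* Go through unsigned char so bytes >= 0x80 are not sign-extended. */
        dst[n] = (uint16_t)(unsigned char)src[n];
        n++;
    }
    dst[n] = 0;
    return n;
}
```

This is consistent with the observation that only statements containing
non-ASCII characters are affected when a non-wide function is called.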

.....Ken



Ken Sell [ Fri, 15 May 2009 00:17 ] [ ID #2001433 ]