
Issues with LC_COLLATE, across 8.3/8.4 on various platforms
Hi, i have the following problem, text ordering seems to behave incosistently across various lc_collate values, OS'es, PostgreSQL versions.
Some behaviour might be expected, some not, thats why i am asking to see where i stand with this.
Test Data
postgres [at] dynacom=# SELECT * from test_sort_order;
fooname
-------------------------
Cylinder head cover No1
Cylinder Liner No1
Cylinder head No1
(3 rows)
Now the query
# SELECT * from test_sort_order order by fooname;
in PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu (lc_collate=en_US.UTF-8) gives
fooname
-------------------------
Cylinder head cover No1
Cylinder head No1
Cylinder Liner No1
while in all of
PostgreSQL 8.3.3 on i686-pc-linux-gnu (lc_collate=C)
PostgreSQL 8.3.7 (lc_collate=el_GR.UTF-8) on i386-unknown-freebsd6.3
PostgreSQL 8.4.1 on x86_64-unknown-freebsd8.0 (lc_collate=en_US.UTF-8), gives
fooname
-------------------------
Cylinder Liner No1
Cylinder head No1
Cylinder head cover No1
Database encoding is SQL_ASCII in all four cases.
--
Achilleas Mantzios
--
Sent via pgsql-admin mailing list (pgsql-admin [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms
It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD are tw=
o different stories, hence the seen results.
(Switching LC_COLLATE between en_US.UTF-8 and POSIX, in FreeBSD it does not=
have any impact on orderring while in Linux does)
I think i have resolved the issue, so the solution boils down to setting LC=
_COLLATE=3DC in our linux central production postgresql server.
What is the shortest way to change the default locale on a production insta=
llation running PostgreSQL-8.3.9?
Is there anything less painful than dump, initdb,restore?
I know 8.4 allows for per-db locale settings, i am just asking if there is =
a way to solve the issue now without waiting till
the migration to 8.4
Thanx!
=CE=A3=CF=84=CE=B9=CF=82 Tuesday 02 February 2010 14:26:44 =CE=BF/=CE=B7 Ac=
hilleas Mantzios =CE=AD=CE=B3=CF=81=CE=B1=CF=88=CE=B5:
> Hi, i have the following problem, text ordering seems to behave incosiste=
ntly across various lc_collate values, OS'es, PostgreSQL versions.
> Some behaviour might be expected, some not, thats why i am asking to see =
where i stand with this.
> Test Data
> postgres [at] dynacom=3D# SELECT * from test_sort_order;
> fooname
> -------------------------
> Cylinder head cover No1
> Cylinder Liner No1
> Cylinder head No1
> (3 rows)
>
> Now the query
> # SELECT * from test_sort_order order by fooname;
> in PostgreSQL 8.3.7 on x86_64-unknown-linux-gnu (lc_collate=3Den_US.UTF-=
8) gives
> fooname
> -------------------------
> Cylinder head cover No1
> Cylinder head No1
> Cylinder Liner No1
>
> while in all of
> PostgreSQL 8.3.3 on i686-pc-linux-gnu (lc_collate=3DC)
> PostgreSQL 8.3.7 (lc_collate=3Del_GR.UTF-8) on i386-unknown-freebsd6.3
> PostgreSQL 8.4.1 on x86_64-unknown-freebsd8.0 (lc_collate=3Den_US.UTF-8),=
gives
> fooname
> -------------------------
> Cylinder Liner No1
> Cylinder head No1
> Cylinder head cover No1
>
> Database encoding is SQL_ASCII in all four cases.
>
> --
> Achilleas Mantzios
>
--
Achilleas Mantzios
--
Sent via pgsql-admin mailing list (pgsql-admin [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms
Achilleas Mantzios <achill [at] matrix.gatewaynet.com> writes:
> What is the shortest way to change the default locale on a production installation running PostgreSQL-8.3.9?
> Is there anything less painful than dump, initdb,restore?
No :-(
regards, tom lane
--
Sent via pgsql-admin mailing list (pgsql-admin [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms
=CE=A3=CF=84=CE=B9=CF=82 Tuesday 02 February 2010 17:37:05 =CE=BF/=CE=B7 To=
m Lane =CE=AD=CE=B3=CF=81=CE=B1=CF=88=CE=B5:
> Achilleas Mantzios <achill [at] matrix.gatewaynet.com> writes:
> > What is the shortest way to change the default locale on a production i=
nstallation running PostgreSQL-8.3.9?
> > Is there anything less painful than dump, initdb,restore?
>
> No :-(
>
Thanx Tom,
taking into account that the large and most important DB in our installatio=
n (over 100Gb as SQL dump) lives in a tablespace of its own,
is there a way to "cheat" postgresql after initdb in order to see this tabl=
espace and the database in it, avoiding the whole cycle?
i think this is not a good idea, i just thought i could ask, just to make s=
ure.
> regards, tom lane
>
--
Achilleas Mantzios
--
Sent via pgsql-admin mailing list (pgsql-admin [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Re: Issues with LC_COLLATE, across 8.3/8.4 on variousplatforms
On tis, 2010-02-02 at 17:28 +0200, Achilleas Mantzios wrote:
> It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD
> are two different stories, hence the seen results.
The UTF-8 locales on FreeBSD and Mac OS X are pretty much broken.
This is becoming a FAQ ...
--
Sent via pgsql-admin mailing list (pgsql-admin [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms
=CE=A3=CF=84=CE=B9=CF=82 Wednesday 03 February 2010 16:53:20 =CE=BF/=CE=B7 =
Peter Eisentraut =CE=AD=CE=B3=CF=81=CE=B1=CF=88=CE=B5:
> On tis, 2010-02-02 at 17:28 +0200, Achilleas Mantzios wrote:
> > It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD
> > are two different stories, hence the seen results.
>
> The UTF-8 locales on FreeBSD and Mac OS X are pretty much broken.
>
> This is becoming a FAQ ...
This is a libc/locale combination issue. Care to elaborate?
AFAICT for my case in FreeBSD, the greek UTF locale el_GR.UTF-8 behaves as =
expected without problems.
>
>
--
Achilleas Mantzios
--
Sent via pgsql-admin mailing list (pgsql-admin [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Re: Issues with LC_COLLATE, across 8.3/8.4 on variousplatforms
On ons, 2010-02-03 at 17:05 +0200, Achilleas Mantzios wrote:
> =CE=A3=CF=84=CE=B9=CF=82 Wednesday 03 February 2010 16:53:20 =CE=BF/=CE=
=B7 Peter Eisentraut =CE=AD=CE=B3=CF=81=CE=B1=CF=88=CE=B5:
> > On tis, 2010-02-02 at 17:28 +0200, Achilleas Mantzios wrote:
> > > It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBS=
D
> > > are two different stories, hence the seen results.
> >
> > The UTF-8 locales on FreeBSD and Mac OS X are pretty much broken.
> >
> > This is becoming a FAQ ...
>
> This is a libc/locale combination issue. Care to elaborate?
> AFAICT for my case in FreeBSD, the greek UTF locale el_GR.UTF-8 behaves=
as expected without problems.
Well,
"""
while in all of
PostgreSQL 8.3.3 on i686-pc-linux-gnu (lc_collate=3DC)
PostgreSQL 8.3.7 (lc_collate=3Del_GR.UTF-8) on i386-unknown-freebsd6.3
PostgreSQL 8.4.1 on x86_64-unknown-freebsd8.0 (lc_collate=3Den_US.UTF-8),
gives
fooname
-------------------------
Cylinder Liner No1
Cylinder head No1
Cylinder head cover No1
"""
is not a correct sorting result in my book, unless your measure of
correctness is ASCII byte value order. And then it's left as an
exercise to determine what the results would be for texts involving
non-ASCII letters.
--
Sent via pgsql-admin mailing list (pgsql-admin [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
Re: Issues with LC_COLLATE, across 8.3/8.4 on various platforms
=CE=A3=CF=84=CE=B9=CF=82 Wednesday 03 February 2010 21:16:36 =CE=BF/=CE=B7 =
Peter Eisentraut =CE=AD=CE=B3=CF=81=CE=B1=CF=88=CE=B5:
> On ons, 2010-02-03 at 17:05 +0200, Achilleas Mantzios wrote:
> > =CE=A3=CF=84=CE=B9=CF=82 Wednesday 03 February 2010 16:53:20 =CE=BF/=CE=
=B7 Peter Eisentraut =CE=AD=CE=B3=CF=81=CE=B1=CF=88=CE=B5:
> > > On tis, 2010-02-02 at 17:28 +0200, Achilleas Mantzios wrote:
> > > > It seems that libc's behaviour regarding LC_COLLATE in Linux/FreeBSD
> > > > are two different stories, hence the seen results.
> > >
> > > The UTF-8 locales on FreeBSD and Mac OS X are pretty much broken.
> > >
> > > This is becoming a FAQ ...
^^^^
I guess you are right. In my new FreeBSD system i get:
postgres [at] smadevnew:~> uname -sr
FreeBSD 8.0-RELEASE-p2
postgres [at] smadevnew:~> ls -l /usr/share/locale/en_US.UTF-8/
total 0
lrwxr-xr-x 1 root wheel 28 Nov 21 16:30 LC_COLLATE -> ../la_LN.US-ASCII/=
LC_COLLATE
lrwxr-xr-x 1 root wheel 17 Nov 21 16:30 LC_CTYPE -> ../UTF-8/LC_CTYPE
lrwxr-xr-x 1 root wheel 30 Nov 21 16:30 LC_MESSAGES -> ../en_US.ISO8859-=
1/LC_MESSAGES
lrwxr-xr-x 1 root wheel 30 Nov 21 16:30 LC_MONETARY -> ../en_US.ISO8859-=
1/LC_MONETARY
lrwxr-xr-x 1 root wheel 29 Nov 21 16:30 LC_NUMERIC -> ../en_US.ISO8859-1=
/LC_NUMERIC
lrwxr-xr-x 1 root wheel 26 Nov 21 16:30 LC_TIME -> ../en_US.ISO8859-1/LC=
_TIME
>
>
--
Achilleas Mantzios
--
Sent via pgsql-admin mailing list (pgsql-admin [at] postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin