subquery fails when a NOT IN operator tests a subset with NULL valu

Hi.
Here is a description of what looks like a serious bug.
This is related to bugs #7294 and #6247

Tested against mysql 4.1.9 and 4.1.10.

Cheers

Giuseppe Maxia

>Description:
operator NOT IN fails when a subquery returns one or more NULL values.
>How-To-Repeat:
simple proof of concept:
mysql> select 1 in (1,null,3);
+-----------------+
| 1 in (1,null,3) |
+-----------------+
| 1 |
+-----------------+
1 row in set (0.00 sec)
#OK

mysql> select 2 not in (1,null,3);
+---------------------+
| 2 not in (1,null,3) |
+---------------------+
| NULL |
+---------------------+
1 row in set (0.00 sec)

# NOT OK

More complete proof:

mysql> drop table if exists t1;
Query OK, 0 rows affected (0.00 sec)

mysql> drop table if exists t2;
Query OK, 0 rows affected (0.06 sec)

mysql> create table t1 (id int not null auto_increment primary key, c1 int);
Query OK, 0 rows affected (0.01 sec)

mysql>
mysql> create table t2 (id int not null auto_increment primary key, c2 int);
Query OK, 0 rows affected (0.02 sec)

mysql> insert into t1(c1) values (1),(2);
Query OK, 2 rows affected (0.01 sec)
Records: 2 Duplicates: 0 Warnings: 0

mysql> insert into t2(c2) values (2),(null),(3);
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0

mysql> select * from t1;
+----+------+
| id | c1 |
+----+------+
| 1 | 1 |
| 2 | 2 |
+----+------+
2 rows in set (0.01 sec)

mysql> select * from t2;
+----+------+
| id | c2 |
+----+------+
| 1 | 2 |
| 2 | NULL |
| 3 | 3 |
+----+------+
3 rows in set (0.00 sec)

mysql> select t1.* from t1 left join t2 on (c1=c2) where t2.id is null;
+----+------+
| id | c1 |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.01 sec)

# OK. This is the normal way of checking for non-existence of records in a
# related table

mysql> select t1.* from t1 where c1 not in (select distinct c2 from t2);
Empty set (0.01 sec)

# NOT OK. This query should have returned the same result as the previous one

mysql> select t1.* from t1 where c1 not in (select distinct c2 from t2 where c2 is not null);
+----+------+
| id | c1 |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.01 sec)

# ugly workaround with an express filter

mysql> select t1.* from t1 where c1 not in (select distinct coalesce(c2,0) from t2 );
+----+------+
| id | c1 |
+----+------+
| 1 | 1 |
+----+------+
1 row in set (0.01 sec)

# yet another ugly workaround

>Fix:
as a temporary workaround, filter off the NULLs with
a WHERE clause or a COALESCE function.


>Submitter-Id: <submitter ID>
>Originator: Giuseppe Maxia
>Organization:
Stardata s.r.l
>MySQL support: Certified Consulting Partner
>Synopsis: subquery fails on test with NOT IN and NULL values
>Severity: serious
>Priority: high
>Category: mysql
>Class: sw-bug
>Release: mysql-4.1.10-standard (MySQL Community Edition - Standard (GPL))
>Server: /usr/local/mysql/bin/mysqladmin Ver 8.41 Distrib 4.1.10, for pc-linux-gnu on i686
Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL license

Server version 4.1.10-standard-log
Protocol version 10
Connection Localhost via UNIX socket
UNIX socket /tmp/mysql.sock
Uptime: 41 min 36 sec

Threads: 2 Questions: 111 Slow queries: 0 Opens: 32 Flush tables: 1 Open tables: 2 Queries per second avg: 0.044
>C compiler: 2.95.3
>C++ compiler: 2.95.3
>Environment:
<machine, os, target, libraries (multiple lines)>
System: Linux ltstardata 2.6.9-1.667 #1 Tue Nov 2 14:41:25 EST 2004 i686 i686 i386 GNU/Linux
Architecture: i686

Some paths: /usr/bin/perl /usr/bin/make /usr/bin/gmake /usr/bin/gcc /usr/bin/cc
GCC: Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.2/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared
--enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-java-awt=gtk --host=i386-redhat-linux
Thread model: posix
gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)
Compilation info: CC='gcc' CFLAGS='-O2 -mcpu=pentiumpro' CXX='gcc' CXXFLAGS='-O2 -mcpu=pentiumpro
-felide-constructors' LDFLAGS='' ASFLAGS=''
LIBC:
lrwxrwxrwx 1 root root 13 Feb 12 14:24 /lib/libc.so.6 -> libc-2.3.3.so
-rwxr-xr-x 1 root root 1504728 Oct 28 01:00 /lib/libc-2.3.3.so
-rw-r--r-- 1 root root 2404716 Oct 28 00:46 /usr/lib/libc.a
-rw-r--r-- 1 root root 204 Oct 28 00:08 /usr/lib/libc.so
Configure command: ./configure '--prefix=/usr/local/mysql' '--localstatedir=/usr/local/mysql/data'
'--libexecdir=/usr/local/mysql/bin' '--with-comment=MySQL Community Edition - Standard (GPL)'
'--with-extra-charsets=complex' '--with-server-suffix=-standard' '--enable-thread-safe-client' '--enable-local-infile'
'--enable-assembler' '--disable-shared' '--with-client-ldflags=-all-static' '--with-mysqld-ldflags=-all-static'
'--with-readline' '--with-embedded-server' '--with-archive-storage-engine' '--with-innodb' 'CFLAGS=-O2 -mcpu=pentiumpro'
'CPPFLAGS=-DDEFAULT_THREAD_STACK=126976' 'CXXFLAGS=-O2 -mcpu=pentiumpro -felide-constructors' 'CXX=gcc'


--
Giuseppe Maxia
CTO
http://www.StarData.it
MySQL Certified Professional
__ __ __
___ / /____ ________/ /__ _/ /____ _
(_-</ __/ _ `/ __/ _ / _ `/ __/ _ `/
/___/\__/\_,_/_/ \_,_/\_,_/\__/\_,_/
Database is our business

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql [at] m.gmane.org
Giuseppe Maxia [ Mi, 16 März 2005 07:40 ] [ ID #698015 ]

Re: subquery fails when a NOT IN operator tests a subset with NULL valu

In the last episode (Mar 16), Giuseppe Maxia said:
> Here is a description of what looks like a serious bug. This is
> related to bugs #7294 and #6247
>
> Tested against mysql 4.1.9 and 4.1.10.
>
> >Description:
> operator NOT IN fails when a subquery returns one or more NULL
> values.
> >How-To-Repeat:
> simple proof of concept:
> mysql> select 1 in (1,null,3);
> +-----------------+
> | 1 in (1,null,3) |
> +-----------------+
> | 1 |
> +-----------------+
> #OK
>
> mysql> select 2 not in (1,null,3);
> +---------------------+
> | 2 not in (1,null,3) |
> +---------------------+
> | NULL |
> +---------------------+
> # NOT OK

This looks okay to me, according to
http://dev.mysql.com/doc/mysql/en/comparison-operators.html# id2940868 :

To comply with the SQL standard, from MySQL 4.1 on IN returns
NULL not only if the expression on the left hand side is NULL,
but also if no match is found in the list and one of the
expressions in the list is NULL.

2 doesn't match 1, NULL, or 3, and there's a NULL in the list, so the
IN expression must return NULL. NOT(NULL) is still NULL, so the entire
expression returns NULL.

Subqueries using IN() may not be the same as the IN() expression (I
rarely use subqueries so I don't know); they are documented at
http://dev.mysql.com/doc/mysql/en/any-in-some-subqueries.htm l and
http://dev.mysql.com/doc/mysql/en/all-subqueries.html .

--
Dan Nelson
dnelson [at] allantgroup.com

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql [at] m.gmane.org
Dan Nelson [ Mi, 16 März 2005 09:03 ] [ ID #698016 ]

Re: subquery fails when a NOT IN operator tests a subset with NULLvalu

Dan Nelson wrote:
> In the last episode (Mar 16), Giuseppe Maxia said:
>
>>Here is a description of what looks like a serious bug. This is
>>related to bugs #7294 and #6247
>>
>>Tested against mysql 4.1.9 and 4.1.10.
>>
>>
>>>Description:
>>
>> operator NOT IN fails when a subquery returns one or more NULL
>> values.
>>
>>>How-To-Repeat:
>>
>> simple proof of concept:
>>mysql> select 1 in (1,null,3);
>>+-----------------+
>>| 1 in (1,null,3) |
>>+-----------------+
>>| 1 |
>>+-----------------+
>>#OK
>>
>>mysql> select 2 not in (1,null,3);
>>+---------------------+
>>| 2 not in (1,null,3) |
>>+---------------------+
>>| NULL |
>>+---------------------+
>># NOT OK
>
>
> This looks okay to me, according to
> http://dev.mysql.com/doc/mysql/en/comparison-operators.html# id2940868 :
>
> To comply with the SQL standard, from MySQL 4.1 on IN returns
> NULL not only if the expression on the left hand side is NULL,
> but also if no match is found in the list and one of the
> expressions in the list is NULL.
>

Thanks for your comment.
I knew about that page, and probably I was a bit too quick when I used this example as proof of concept.
Actually, the real problem happens only with subqueries, as I reported in the rest of my previous message.


> 2 doesn't match 1, NULL, or 3, and there's a NULL in the list, so the
> IN expression must return NULL. NOT(NULL) is still NULL, so the entire
> expression returns NULL.
>
> Subqueries using IN() may not be the same as the IN() expression (I
> rarely use subqueries so I don't know); they are documented at
> http://dev.mysql.com/doc/mysql/en/any-in-some-subqueries.htm l and
> http://dev.mysql.com/doc/mysql/en/all-subqueries.html .

The whole point is actually in subqueries, not when using IN or NOT IN in a normal query.
The bug occurs when a NOT IN is used in a subquery as a LEFT JOIN replacement.

SELECT something from t1 where column1 NOT IN (SELECT nullable_column from t2);


BTW, I posted to this list by mistake.
I re-posted an amended version of the same bug report to the bugs list.

Giuseppe Maxia

--
Giuseppe Maxia
CTO
http://www.StarData.it
MySQL Certified Professional
__ __ __
___ / /____ ________/ /__ _/ /____ _
(_-</ __/ _ `/ __/ _ / _ `/ __/ _ `/
/___/\__/\_,_/_/ \_,_/\_,_/\__/\_,_/
Database is our business

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql [at] m.gmane.org
Giuseppe Maxia [ Mi, 16 März 2005 09:34 ] [ ID #698017 ]

Re: subquery fails when a NOT IN operator tests a subset with NULL valu

In article <4237EF79.3090906 [at] stardata.it>,
Giuseppe Maxia <g.maxia [at] stardata.it> writes:

> The whole point is actually in subqueries, not when using IN or NOT IN in a normal query.
> The bug occurs when a NOT IN is used in a subquery as a LEFT JOIN replacement.

> SELECT something from t1 where column1 NOT IN (SELECT nullable_column from t2);

That's not a bug. Let's say that the subquery returns 2, NULL, 3.
Thus the NOT IN is a shorthand for

column1 != 2 AND column1 != NULL AND column1 != 3

Since the second condition is never true, you get an empty result set.


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql [at] m.gmane.org
Harald Fuchs [ Mi, 16 März 2005 10:42 ] [ ID #698018 ]

Re: subquery fails when a NOT IN operator tests a subset with NULLvalu

Giuseppe,

>mysql> select 2 not in (1,null,3);
>+---------------------+
>| 2 not in (1,null,3) |
>+---------------------+
>| NULL |
>+---------------------+
>1 row in set (0.00 sec)

># NOT OK

Isn't that standard SQL behaviour? NULL is not a value. NOT IN compares
the values using '=' and correctly returns NULL if any value is NULL ie
missing, eg for Oracle see http://builder.com.com/5100-6388_14-5319615.html

PB

-----

Giuseppe Maxia wrote:

> Hi.
> Here is a description of what looks like a serious bug.
> This is related to bugs #7294 and #6247
>
> Tested against mysql 4.1.9 and 4.1.10.
>
> Cheers
>
> Giuseppe Maxia
>
> >Description:
> operator NOT IN fails when a subquery returns one or more NULL
> values.
> >How-To-Repeat:
> simple proof of concept:
> mysql> select 1 in (1,null,3);
> +-----------------+
> | 1 in (1,null,3) |
> +-----------------+
> | 1 |
> +-----------------+
> 1 row in set (0.00 sec)
> #OK
>
> mysql> select 2 not in (1,null,3);
> +---------------------+
> | 2 not in (1,null,3) |
> +---------------------+
> | NULL |
> +---------------------+
> 1 row in set (0.00 sec)
>
> # NOT OK
>
> More complete proof:
>
> mysql> drop table if exists t1;
> Query OK, 0 rows affected (0.00 sec)
>
> mysql> drop table if exists t2;
> Query OK, 0 rows affected (0.06 sec)
>
> mysql> create table t1 (id int not null auto_increment primary key, c1
> int);
> Query OK, 0 rows affected (0.01 sec)
>
> mysql>
> mysql> create table t2 (id int not null auto_increment primary key, c2
> int);
> Query OK, 0 rows affected (0.02 sec)
>
> mysql> insert into t1(c1) values (1),(2);
> Query OK, 2 rows affected (0.01 sec)
> Records: 2 Duplicates: 0 Warnings: 0
>
> mysql> insert into t2(c2) values (2),(null),(3);
> Query OK, 3 rows affected (0.00 sec)
> Records: 3 Duplicates: 0 Warnings: 0
>
> mysql> select * from t1;
> +----+------+
> | id | c1 |
> +----+------+
> | 1 | 1 |
> | 2 | 2 |
> +----+------+
> 2 rows in set (0.01 sec)
>
> mysql> select * from t2;
> +----+------+
> | id | c2 |
> +----+------+
> | 1 | 2 |
> | 2 | NULL |
> | 3 | 3 |
> +----+------+
> 3 rows in set (0.00 sec)
>
> mysql> select t1.* from t1 left join t2 on (c1=c2) where t2.id is null;
> +----+------+
> | id | c1 |
> +----+------+
> | 1 | 1 |
> +----+------+
> 1 row in set (0.01 sec)
>
> # OK. This is the normal way of checking for non-existence of records
> in a
> # related table
>
> mysql> select t1.* from t1 where c1 not in (select distinct c2 from t2);
> Empty set (0.01 sec)
>
> # NOT OK. This query should have returned the same result as the
> previous one
>
> mysql> select t1.* from t1 where c1 not in (select distinct c2 from t2
> where c2 is not null);
> +----+------+
> | id | c1 |
> +----+------+
> | 1 | 1 |
> +----+------+
> 1 row in set (0.01 sec)
>
> # ugly workaround with an express filter
>
> mysql> select t1.* from t1 where c1 not in (select distinct
> coalesce(c2,0) from t2 );
> +----+------+
> | id | c1 |
> +----+------+
> | 1 | 1 |
> +----+------+
> 1 row in set (0.01 sec)
>
> # yet another ugly workaround
>
> >Fix:
> as a temporary workaround, filter off the NULLs with
> a WHERE clause or a COALESCE function.
>
>
> >Submitter-Id: <submitter ID>
> >Originator: Giuseppe Maxia
> >Organization:
> Stardata s.r.l
> >MySQL support: Certified Consulting Partner
> >Synopsis: subquery fails on test with NOT IN and NULL values
> >Severity: serious
> >Priority: high
> >Category: mysql
> >Class: sw-bug
> >Release: mysql-4.1.10-standard (MySQL Community Edition - Standard
> (GPL))
> >Server: /usr/local/mysql/bin/mysqladmin Ver 8.41 Distrib 4.1.10, for
> pc-linux-gnu on i686
> Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
> This software comes with ABSOLUTELY NO WARRANTY. This is free software,
> and you are welcome to modify and redistribute it under the GPL license
>
> Server version 4.1.10-standard-log
> Protocol version 10
> Connection Localhost via UNIX socket
> UNIX socket /tmp/mysql.sock
> Uptime: 41 min 36 sec
>
> Threads: 2 Questions: 111 Slow queries: 0 Opens: 32 Flush tables:
> 1 Open tables: 2 Queries per second avg: 0.044
> >C compiler: 2.95.3
> >C++ compiler: 2.95.3
> >Environment:
> <machine, os, target, libraries (multiple lines)>
> System: Linux ltstardata 2.6.9-1.667 #1 Tue Nov 2 14:41:25 EST 2004
> i686 i686 i386 GNU/Linux
> Architecture: i686
>
> Some paths: /usr/bin/perl /usr/bin/make /usr/bin/gmake /usr/bin/gcc
> /usr/bin/cc
> GCC: Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.2/specs
> Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
> --infodir=/usr/share/info --enable-shared --enable-threads=posix
> --disable-checking --with-system-zlib --enable-__cxa_atexit
> --disable-libunwind-exceptions --enable-java-awt=gtk
> --host=i386-redhat-linux
> Thread model: posix
> gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)
> Compilation info: CC='gcc' CFLAGS='-O2 -mcpu=pentiumpro' CXX='gcc'
> CXXFLAGS='-O2 -mcpu=pentiumpro -felide-constructors' LDFLAGS=''
> ASFLAGS=''
> LIBC:
> lrwxrwxrwx 1 root root 13 Feb 12 14:24 /lib/libc.so.6 -> libc-2.3.3.so
> -rwxr-xr-x 1 root root 1504728 Oct 28 01:00 /lib/libc-2.3.3.so
> -rw-r--r-- 1 root root 2404716 Oct 28 00:46 /usr/lib/libc.a
> -rw-r--r-- 1 root root 204 Oct 28 00:08 /usr/lib/libc.so
> Configure command: ./configure '--prefix=/usr/local/mysql'
> '--localstatedir=/usr/local/mysql/data'
> '--libexecdir=/usr/local/mysql/bin' '--with-comment=MySQL Community
> Edition - Standard (GPL)' '--with-extra-charsets=complex'
> '--with-server-suffix=-standard' '--enable-thread-safe-client'
> '--enable-local-infile' '--enable-assembler' '--disable-shared'
> '--with-client-ldflags=-all-static'
> '--with-mysqld-ldflags=-all-static' '--with-readline'
> '--with-embedded-server' '--with-archive-storage-engine'
> '--with-innodb' 'CFLAGS=-O2 -mcpu=pentiumpro'
> 'CPPFLAGS=-DDEFAULT_THREAD_STACK=126976' 'CXXFLAGS=-O2
> -mcpu=pentiumpro -felide-constructors' 'CXX=gcc'
>
>


--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.7.3 - Release Date: 3/15/2005


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql [at] m.gmane.org
Peter Brawley [ Mi, 16 März 2005 17:53 ] [ ID #698928 ]

Re: subquery fails when a NOT IN operator tests a subset with NULLvalu

Peter Brawley wrote:
> Giuseppe,
>
> >mysql> select 2 not in (1,null,3);
> >+---------------------+
> >| 2 not in (1,null,3) |
> >+---------------------+
> >| NULL |
> >+---------------------+
> >1 row in set (0.00 sec)
>
> ># NOT OK
>
> Isn't that standard SQL behaviour?

Yes, it is. As I said before, I was too quick using this example, while
the problem arises only during a subquery.

Now, others have pointed out that even with subqueries this should be
considered standard behaviour, even though severa people in my workplace
agree that it does not look intuitive.



> NULL is not a value. NOT IN compares
> the values using '=' and correctly returns NULL if any value is NULL ie
> missing, eg for Oracle see http://builder.com.com/5100-6388_14-5319615.html

I see now that this mechanism is intentional.
Thanks for your link.

Giuseppe



--
Giuseppe Maxia
CTO
http://www.StarData.it
MySQL Certified Professional
__ __ __
___ / /____ ________/ /__ _/ /____ _
(_-</ __/ _ `/ __/ _ / _ `/ __/ _ `/
/___/\__/\_,_/_/ \_,_/\_,_/\__/\_,_/
Database is our business

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql [at] m.gmane.org
Giuseppe Maxia [ Mi, 16 März 2005 18:03 ] [ ID #698930 ]
Datenbanken » gmane.comp.db.mysql.general » subquery fails when a NOT IN operator tests a subset with NULL valu

Vorheriges Thema: dataKiosk 0.6 released
Nächstes Thema: upgrading problems