Bulk Insertion Performance

--20cf30549a8fe6e08f049768570d
Content-Type: text/plain; charset=ISO-8859-1

Hi All,

I have a data warehouse infrastructure with following configuration :
- MySQL 5.0 MyISAM + InndoDB enabled (XAMPP Distribution)
- Windows 2003 64 bit data center edition
- Java Runtime 6 - 32 bit version

And have ETL running data warehouse process. Reading is impressive, 12,000
rows per second. But writing with only 10 columns (integer and varchar
combinations) takes 3,000 rows / second.

Is there a way to configure writing to have a better performance ?

Thanks,

Feris

--20cf30549a8fe6e08f049768570d--
Feris Thia [ Mi, 15 Dezember 2010 01:51 ] [ ID #2051663 ]

Re: Bulk Insertion Performance

On Dec 14, 2010, at 6:51 PM, Feris Thia wrote:

> Hi All,
>
> I have a data warehouse infrastructure with following configuration :
> - MySQL 5.0 MyISAM + InndoDB enabled (XAMPP Distribution)
> - Windows 2003 64 bit data center edition
> - Java Runtime 6 - 32 bit version
>
> And have ETL running data warehouse process. Reading is impressive, =
12,000
> rows per second. But writing with only 10 columns (integer and varchar
> combinations) takes 3,000 rows / second.
>
> Is there a way to configure writing to have a better performance ?

Feris,

*How* are you writing, via batch statements with rewriting, or directly, =
or via LOAD DATA INFILE? It seems you're off by about a factor of 10-20x =
from what I've seen performance-wise for writes.

-Mark
--
Mark Matthews
Principal Software Developer - MySQL Enterprise Tools
Oracle
http://www.mysql.com/products/enterprise/monitor.html








--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2 [at] m.gmane.o rg
Mark Matthews [ Mi, 15 Dezember 2010 02:10 ] [ ID #2051664 ]

Re: Bulk Insertion Performance

--20cf3054a1576eeea1049768c3e2
Content-Type: text/plain; charset=ISO-8859-1

Hi Mark,

On Wed, Dec 15, 2010 at 8:10 AM, Mark Matthews <mark.matthews [at] oracle.com>wrote:

> Feris,
>
> *How* are you writing, via batch statements with rewriting, or directly, or
> via LOAD DATA INFILE? It seems you're off by about a factor of 10-20x from
> what I've seen performance-wise for writes.
>

I'm using ETL mean - for this case, it is a java application name Kettle
(Pentaho Data Integration). And it use JDBC connection.

Is it a JDBC driver configuration ?


> -Mark
> --
>

Thanks,

Feris

--20cf3054a1576eeea1049768c3e2--
Feris Thia [ Mi, 15 Dezember 2010 02:21 ] [ ID #2051665 ]

Re: Bulk Insertion Performance

On Dec 14, 2010, at 7:21 PM, Feris Thia wrote:

> Hi Mark,
>
> On Wed, Dec 15, 2010 at 8:10 AM, Mark Matthews =
<mark.matthews [at] oracle.com> wrote:
> Feris,
>
> *How* are you writing, via batch statements with rewriting, or =
directly, or via LOAD DATA INFILE? It seems you're off by about a factor =
of 10-20x from what I've seen performance-wise for writes.
>
> I'm using ETL mean - for this case, it is a java application name =
Kettle (Pentaho Data Integration). And it use JDBC connection.
>
> Is it a JDBC driver configuration ?

Feris,

I don't know what Kettle is doing under the hood, but if it's doing =
addBatch(), executeBatch(), then adding "rewriteBatchedStatements=3Dtrue" =
to your MySQL JDBC URL should probably help quite a bit.

-Mark
--
Mark Matthews
Principal Software Developer - MySQL Enterprise Tools
Oracle
http://www.mysql.com/products/enterprise/monitor.html








--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2 [at] m.gmane.o rg
Mark Matthews [ Mi, 15 Dezember 2010 22:50 ] [ ID #2051673 ]

Re: Bulk Insertion Performance

--90e6ba6e838add3d5204978216fc
Content-Type: text/plain; charset=ISO-8859-1

Hi Mark,

On Thu, Dec 16, 2010 at 4:50 AM, Mark Matthews <mark.matthews [at] oracle.com>
wrote:

> Feris,
>
> I don't know what Kettle is doing under the hood, but if it's doing
> addBatch(), executeBatch(), then adding "rewriteBatchedStatements=true" to
> your MySQL JDBC URL should probably help quite a bit.
>

It works. By having rewriteBatchedStatements=true in the jdbc url it
increases. Now it performs an average 4500 rows / second. Thanks Mark.

Regards,

Feris

--90e6ba6e838add3d5204978216fc--
Feris Thia [ Do, 16 Dezember 2010 08:34 ] [ ID #2051707 ]

Re: Bulk Insertion Performance

--90e6ba53acf4f71848049783534b
Content-Type: text/plain; charset=ISO-8859-1

Hmm, interesting. What does this do, exactly ? Can something similar be
applied to non-jdbc connections, too ?

On Thu, Dec 16, 2010 at 8:34 AM, Feris Thia <
milis.database [at] phi-integration.com> wrote:

> Hi Mark,
>
> On Thu, Dec 16, 2010 at 4:50 AM, Mark Matthews <mark.matthews [at] oracle.com>
> wrote:
>
> > Feris,
> >
> > I don't know what Kettle is doing under the hood, but if it's doing
> > addBatch(), executeBatch(), then adding "rewriteBatchedStatements=true"
> to
> > your MySQL JDBC URL should probably help quite a bit.
> >
>
> It works. By having rewriteBatchedStatements=true in the jdbc url it
> increases. Now it performs an average 4500 rows / second. Thanks Mark.
>
> Regards,
>
> Feris
>



--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel

--90e6ba53acf4f71848049783534b--
Johan De Meersman [ Do, 16 Dezember 2010 10:03 ] [ ID #2051710 ]

Re: Bulk Insertion Performance

--20cf304346c6a419390497ba0389
Content-Type: text/plain; charset=ISO-8859-1

Hi Johan,

On Thu, Dec 16, 2010 at 4:03 PM, Johan De Meersman <vegivamp [at] tuxera.be>wrote:

> Hmm, interesting. What does this do, exactly ? Can something similar be
> applied to non-jdbc connections, too ?
>

I'm not quite sure... but will try to trace it. Will ask this in another
thread.

FYI, when I try it using FIFO mechanism in Linux. It increases into 20,000
rows / second insertion.

Regards,

Feris

--20cf304346c6a419390497ba0389--
Feris Thia [ So, 19 Dezember 2010 03:18 ] [ ID #2051835 ]
Datenbanken » gmane.comp.db.mysql.general » Bulk Insertion Performance

Vorheriges Thema: MySQL 5.5.8 GA's library files have not version information by default?
Nächstes Thema: mysql on linux tutorials written by me for community