Oracle equivalent of MySQL group_concat

MySQL has a very useful function called GROUP_CONCAT() that concatenate the aggregated values of a group (GROUP BY). With Oracle 11G, you can use listagg to do the same, but Oracle 10G does not have an equivalent function.

If you are still working with Oracle 10G, this is the simplest way I found to concatenate aggregated values.

The separator used CANNOT be present in any of the concatenated values . It can be a problem if you deal with large text. I used “|” because I know it’s a character that cannot be present in my values.

Original Data:

Oracle Group Concat






Concatenated Data

Oracle Group Concat 2





Install RMySQL on Ubuntu

RMySQL is the database interface and MySQL driver for R. If you don’t already have MySQL libraries and try to install it, you could get an error like:

This is because the installation don’t find the required MySQL C API libraries. You can manually download them and specify the path as arguments, but a much simpler way is to use apt-get:

Then, in your R environment you can reinstall the package!

Quick tips to troubleshoot the first installation of Pentaho BI 5.0.1 CE with MySQL

Last week, I successfully installed Pentaho BI Server Community Edition. It wasn’t as easy as I expected. I did follow a really good tutorial from David Fombella at  Like many people, I did have the famous 404 error. It’s not a big deal, there is no error that can’t be explained with the log files, right ?

Pentaho Logo

Consider the following as an extension of the tutorial

Tip #1

Be careful at step #5 and don’t forget to delete tomcat/conf/Catalina/localhost/pentaho.xml . Keep in mind that every time you modify tomcat/webapps/pentaho/META-INF/context.xml, pentaho.xml has to be deleted because it’s an exact copy of context.xml.

Tip #2

Don’t copy/paste the configuration provided in the tutorial for pentaho-solutions/system/jackrabbit/repository.xml  (Step #6) because the CLASS attribute of the all the nodes is missing ! The XML file already has examples that you can copy past with all the nodes and the proper attributes.

Pentaho.xml is not the only config file that is duplicated. I spent HOURS searching the node with the missing class attribute  in JackRabbits’s repository.xml (yup, I did copy paste from the tutorial..). I ended up reverting all my modifications and I still had the missing class error in the log file, over and over.  The fact is: if you do start pentaho with the missing class attribute in repository.xml, it will create a folder named pentaho-solutions/system/jackrabbit/repository/ which seems to have generated/copied files based on the configuration. The problem is it generates a file with a missing class attribute ! It took me too long to realize pentaho-solutions/system/jackrabbit/repository/ IS NOT PART OF THE DEFAULT INSTALLATION. It’s generated. If you screw up the config like me, DELETE THE ENTIRE FOLDER.

Tip #3

Another thing you have to know is about the folders tomcat/work/Catalina and tomcat/temp. Both folders contain temporary file that can be removed. But the folders themselves must exist! For an unknown reason, my tomcat/temp folder was missing and that prevents Jackrabbit to start properly with the not-so-useful error:


So, if you are getting problems with the installation like many people, including myself have, just execute this every time you change a config file. It will allow to start with a clean environment.

I hope it could save you time !

Oh! One last thing..don’t try to run it on a 1Gb RAM server.. it won’t work. :)



This WordPress blog runs on MariaDB

Recently, my hosting company Funio has replaced MySQL for MariaDB on their shared hosting infrastructure. Some friends working there told me they get better performance with MariaDB than their previous MySQL installation.

MariaDB is a drop-in replacement of MySQL. This WordPress blog was first installed with MySQL, and suddenly and automagically runs on MariaDB. As a DBA, I was excited to use all the new features… but I realized that I’m pretty limited on Funio’s database infrastructure since I’m just a simple user…! So, just for fun, I changed the engine of all my tables to ARIA.  It’s the least I can do.

ARIA is a crash-safe alternative to MyISAM and according to plans, it will become Transactional in the next major release. I don’t expect it to be as fast, reliable and robust than InnoDB (or XtraDB.. the altered version of InnoDB developed by Percona and included in MariaDB), but it’s definitely a good “hybrid” solution between MyISAM et InnoDB.

I didn’t experience issue since the changes, so if you are a WordPress user, don’t be afraid to give it a try !





Tuning tips every PDI user using MySQL should know: increase table output speed by 7000%

I found a thread on Pentaho Forum about Table Output Performance.  It gives a really easy way to speed up your table output.. A LOT! Some people have been able to reach 84k rows/sec using the method. It is not a joke. It works.  It really does.

All you need to do is to give the MySQL JDBC driver 3 parameters:

It allows to insert many rows with a single statement instead of one row per statement. Also, useCompression reduce the traffic between the server and pentaho.

I tested it myself.

I used a 869mb CSV file (5248714 rows) from my Laptop and transferred it to an out-of-the-box MySQL installation on a cheap VPS server (1 CPU, 1gb ram) with a poor network in Europe (yep.. across the ocean). The expected performance is a very slow transfer.

I set the number of copies to start of my Table Output to 4 and commit size to 1000.

Here is the result without the options: I reached 52 rows/seconds.


The same test, but with the 3 connection parameters: 3674 rows/second ! Wow ! 7000% faster !


Impressive isn’t it !?



Three different ways to create a CSV with MySQL

There are 2 well known ways to create a CSV using MySQL. But there is a technique I’ve been using for many years that most blogs and forums don’t talk about. I do believe it’s worth showing the technique I use since it’s faster.

First, there is the “standard” way:

The second technique is by using mysql client and sed

And the technique I use is by using the CSV storage engine.

I now have a file named test_csv_table.CSV in my MySQL datadir! If you don’t know where is your datadir, use the statement:

Obviously, it requires to have the privileges to read the mysql datadir. Also, keep in mind the CSV engine doesn’t support nullable columns.

What I like with that third technique is I don’t have to remember the complex syntax of the other techniques. It’s pretty straight forward.

Oracle SQL Update Statement with INNER JOIN

Oracle (the database) does not support joins in the UPDATE statement. This is one of many tiny annoying fact about Oracle. Here are 2 different ways to overcome that limitation.

The subqueries


The updatable view


MySQL provides, in my opinion, a  clearer syntax to update


I can now refer to my blog as an extension of my brain :)