MySQL: select the n-th element from a delimiter separated value set

Lately I’ve been dealing with a database where a lot of values are stored as strings of concatenated values, separated by delimiters (e.g. “1,2,3,4″). I have to use these values in queries, for example, for joining with other tables. Selecting a specific value from a list of delimiter separated values can be a bit tricky.

In the beginning I was trying to do that using the SUBSTR and LOCATE functions of MySQL. This worked very well when I had only two values in the string. For example:

mysql> SET @values = 'Cat#Dog';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT SUBSTR(@values,1,LOCATE('#',@values)-1) `1st value`;
+-----------+
| 1st value |
+-----------+
| Cat       |
+-----------+

mysql> SELECT SUBSTR(@values,LOCATE('#',@values)+1) `2nd value`;
+-----------+
| 2nd value |
+-----------+
| Dog       |
+-----------+

But if you have more than two values in your string, things start to get more complicated. For example, here’s how to get the 2nd value from a 3-value string:

mysql> SET @values = 'Square#Triangle#Hexagon';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT SUBSTR(@values,LOCATE('#',@values)+1,LOCATE('#',
       SUBSTR(@values,LOCATE('#',@values)+1))-1) `2nd value`;
+-----------+
| 2nd value |
+-----------+
| Triangle  |
+-----------+

I thought I should find a way to generalize this so I can get the n-th value of N-value strings, but it was obvious that it was going to get unreadable and unmanageable very soon.

Then I found about an interesting MySQL function called SUBSTRING_INDEX(). According to MySQL’s documentation it “returns a substring from a string before the specified number of occurrences of the delimiter”, which at least to my ears sounds a bit obscure… After doing a couple of tests it was very clear to me to understand what it does:

mysql> SET @values = '1#2#3#4#5#6#7#8#9#10';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT SUBSTRING_INDEX(@values,'#',4);
+--------------------------------+
| SUBSTRING_INDEX(@values,'#',4) |
+--------------------------------+
| 1#2#3#4                        |
+--------------------------------+

mysql> SELECT SUBSTRING_INDEX(@values,'#',-3);
+---------------------------------+
| SUBSTRING_INDEX(@values,'#',-3) |
+---------------------------------+
| 8#9#10                          |
+---------------------------------+

This function takes a string of values, a delimiter and a number n that specifies a position (or index). This index can be either positive or negative. If it is positive, the function returns the portion of the string that contains the first n values. If it is negative, it returns the portion of the string that contains the last ABS(n) values.

That was all I needed to write a generic piece of code that would return the n-th value. First one call SUBSTRING_INDEX to get the portion of the string that contain the first n values, and then a second call to get the last value from the subset!

In this example we extract the 4th value from a string containing animals:

mysql> SET @values = 'cat#dog#horse#parrot#gecko';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(@values,'#',4),'#',-1);
+--------------------------------------------------------+
| SUBSTRING_INDEX(SUBSTRING_INDEX(@values,'#',4),'#',-1) |
+--------------------------------------------------------+
| parrot                                                 |
+--------------------------------------------------------+

In general, to select the N-th value from a string VALUES that contains values separated by delimiter DELIM, you have to use:

SUBSTRING_INDEX( SUBSTRING_INDEX( VALUES, DELIM, N ), DELIM, -1 )

I hope this helped! :-)

 

This entry was posted in Uncategorized. Bookmark the permalink.

11 Responses to MySQL: select the n-th element from a delimiter separated value set

  1. abcd says:

    thanks alot

  2. Worked great thanks !

    Used it to extract the ALT part for images out of a TEXT column and insert it into a new version of the web site.

  3. Leon says:

    I was searching for an improvement for my own query, which was EXACTLY this. The problem with this general solution is that when the Nth value does not exist, because there are only N-2 values, then the last value is returned, instead of NULL

  4. Daniel says:

    wow this help me alot ! thanks

  5. Rado Marius says:

    thank you very much! it works for me!

  6. Jay says:

    helped!! exactly what i was looking for.

    my query using above :
    UPDATE `product_details_csv` SET image_link=SUBSTRING_INDEX( SUBSTRING_INDEX( large_image , ‘,’ , 1 ), ‘,’, -1 )

  7. Shaka says:

    Thanks Man !! You Rock !!
    You made my day !!

  8. Sergey says:

    You best man in the world!
    Thanks a lot!!!

  9. Mike says:

    I solved the answer to Leon’s question. You would need to place a conditional around each substring query to see if the next substring matches the last substring..

    IF(SUBSTRING_INDEX( SUBSTRING_INDEX( VALUES, DELIM, 2 ), DELIM, -1 )=SUBSTRING_INDEX( SUBSTRING_INDEX( VALUES, DELIM, 1 ), DELIM, -1 ),”,SUBSTRING_INDEX( SUBSTRING_INDEX( VALUES, DELIM, 2 ), DELIM, -1 ))

  10. Juan says:

    Simple answer… the simpler the better !

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>