Introduction to Regular Expressions in MySQL

Using the LIKE clause can only get you so far when you are building search functionality into your applications. Implementing Regular Expressions will open your search functionality far beyond the most complicated LIKE statement you can create.

SELECT    City
FROM      Customers
WHERE     City REGEXP 'vancouver'
ORDER BY  City ASC;
SELECT    City
FROM      Customers
WHERE     City REGEXP '.ancouver'
ORDER BY  City ASC;

The “.” is used to replace a single character in your search string.

The querys above would work the same if you replaced REGEXP with LIKE (and used wildcards). The main difference between LIKE and REGEXP is that LIKE matches entire column values (unless wildcards are used), and REGEXP automatically searches for matches within the column values.

SELECT     City
FROM       Customers
WHERE      City REGEXP "Vancouver|Richmond"
ORDER BY   City ASC;

Searching for multiple items can be accomplished by separating the searches with the OR operator (|).

SELECT     City
FROM       Customers
WHERE      City REGEXP '[St|Saint] Johns'
ORDER BY   City ASC;

[] is used to specify AND as a operator, [12345] and [1|2|3|4|5] are the same.

It is important to note that ‘St|Saint Johns’ and ‘[St|Saint] Johns’ will not produce identical results. Below are the example queries and their respective output:

SELECT    Distinct(City)
FROM      Customers
WHERE     City REGEXP '[st|saint] john';
+--------------+
| City         |
+--------------+
| ST JOHN'S    |
| SAINT JOHN   |
| FORT ST JOHN |
+--------------+
3 rows in set (0.09 sec)
SELECT     City
FROM       Customers
WHERE      City REGEXP 'st|saint john';
+-----------------------------+
| City                        |
+-----------------------------+
| ESTEVAN                     |
| FORESTBURG                  |
| ST-ISIDORE                  |
| ESTERHAZY                   |
| EAST YORK                   |
...
| ALVINSTON                   |
| EAST PROVIDENCE             |
+-----------------------------+
148 rows in set (0.14 sec)

The ‘st|saint john’ query is requesting all cities that include the letters ‘st’ OR ‘saint john’, whereas the ‘[st|saint] john’ query requested the letters ‘st john’ AND ‘saint john’. Big difference!

[1-9] is shorthand for [123456789]

[a-e] is shorthand for [abcde]

If you are still with me on the REGEXP syntax, my next post will go further into searching for multiple instances and repetition in MySQL.


		
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: