MYSQL REGEX Details
MYSQL REGEX Details
Forum Donate
Ilenia Magoni
Being able to do complex queries can be really useful in SQL.
In this article, we'll look at how you can use the Contains String query.
SQL Patterns
SQL patterns are useful for pattern matching, instead of using literal comparisons. They have a
more limited syntax than RegEx, but they're more universal through the various SQL versions.
SQL patterns use the LIKE and NOT LIKE operators and the metacharacters (characters that stand
for something other than themselves) % and _.
Character Meaning
% Any sequence of characters
_ Exactly one character
You can use these characters in a wide variety of use-cases. Here are some examples:
example pattern usage
re% Strings that begin with a specific substring
%re Strings that end with a specific substring
%re% Strings that have a specific substring anywhere in the string
%re_ Strings that have a specific substring at a specific position from the end¹
__re% Strings that have a specific substring at a specific position from the beginning²
¹ (in the example, second to last and third to last characters are determined)
² (in the example, third and fourth characters are determined)
Example query
SELECT name FROM planets
WHERE name LIKE "%us";
Where planets is a table with the data of the solar system's planets.
With this query you would get the below names of the planets that end with "us".
name
Venus
Uranus
The NOT LIKE operator finds all strings that do not match the pattern.
With this query you get all planets whose names don't contain the letter u, like below.
name
Earth
Mars
Let's take the example used with LIKE and let's use it here too.
You can use this operator with NOT in front to have the opposite effect. This is how you would
write the example we used before using SIMILAR TO instead:
RegEx in SQL
What about if you need more complex pattern matching? Well, for that you need to use Regular
Expressions.
What is RegEx?
RegEx on its own is a powerful tool that allows for flexible pattern recognition. You can use
RegEx in many languages like PHP, Python, and also SQL.
RegEx lets you match patterns by character class (like all letters, or just vowels, or all digits),
between alternatives, and other really flexible options. You will see them below.
Keep in mind that the freeCodeCamp curriculum presents RegEx for JavaScript, so there is not a
perfect match, and we need to convert the syntax. Still, it gives you a good overview of basic
RegEx features, so let's follow this curriculum so you can get a good idea of what RegEx can do.
The easiest way to use RegEx it's to use it to match an exact sequence of characters.
For example the regex "Kevin" will match all strings that contains those letters in that exact
sequence, as "Kevin", "Kevin is great", "this is my friend Kevin" and so on.
The wildcard period . matches any character, for example "hu." would match anything that
contains an h followed by an u followed by any character, such as "hug", "hum", "hub", "huh",
but also "husband", "churros", "thumb", "shuttle" and so on.
You can use a character class (or character set) to match a group of characters, for example
"b[aiu]g" would match any string that contains a b, then one letter between a, i and u, and then
a g, such as "bug", "big", "bag", but also "cabbage", "ambigous", "ladybug", and so on.
You have seen above how you can match a group of characters with character classes, but if you
want to match a long list of letters that is a lot of typing.
To avoid all that typing, you can define a range. For example you can match all letters between a
and e with "[a-e]".
A regex like "[a-e]at" would match all strings that have in order one letter between a and e,
then an a and then a t, such as "cat", "bat" and "eat", but also "birdbath", "bucatini", "date",
and so on.
You can also use the hyphen to match numbers. For example "[0-5]" would match any number
between 0 and 5, including 0 and 5.
You can also combine different ranges together in a single character set. For example "[a-z0-
9]" would match all letters from a to z and all numbers from 0 to 5.
You can also use a character set to exclude some characters from a match, these sets are called
negated character sets.
You can create a negated character set by placing a caret character ( ^) after the opening bracket
of the character class.
For example "[^aeiou]" matches all characters that are not vowels. It would match strings like
"rythm" where no character is a vowel, or also "87 + 14".
If you need to match a specific character or group of characters that can appear one or more
times, you can use the character + after this character.
For example, "as+i" would match strings that contain one a followed by one or more s followed
by one i, such as "occasional", "assiduous" and so on.
If you can use + to match a character one or more times, there is also * to match a character zero
or more times.
A regular expression such as "as*i" would match, other than "occasional" and "assiduous" also
strings such as "aide".
Until now you have seen ways to match anywhere in the string, without the option to say where
the match must be.
We use the character ^ to match the beginning of a string, for example a regex such as "^Ricky"
would match "Ricky is my friend", but not "This is Ricky".
Just as there's a way to match the beginning of a string, there is also a way to match the end of a
string.
You can use the character $ to match the end of a string, so for example "story$" would match
any string that ends with "story", such as "This is a never ending story", but not a string such a
"Sometimes a story will have to end".
You can combine the two characters ^ and $ to match a whole string.
So, taking one of the previous examples, writing "b[aiu]g" can match both "big" and "bigger",
but if instead you want to match only "big", "bag" and "bug", adding the two beginning and
ending string characters ensures that there can't be other characters in the string: "^b[aiu]g$".
This pattern would match only "big", "bag" and "bug", and it doesn't match "bigger" or
"ambiguous".
Match All Letters and Numbers
You have seen before how to match characters with a character class.
There are a few predefined classes, called POSIX classes, that you can use instead. So if you
want to match all letters and numbers like with "[0-9a-zA-Z]" you can instead write
"[[:alphanum:]]".
If instead you want to match anything that is not a letter of a number, you can use the alphanum
POSIX class together with a negated character set: "[^[:alphanum:]].
You can also use a POSIX class to match all numbers instead of using "[0-9]", like so:
"[[:digit:]]".
You can use the digit POSIX class with a negated character set to match anything that is not a
number, like so: "[^[:digit:]]".
Match Whitespace
You can match whitespace with the POSIX class "[[:blank:]]" or "[[:space:]]". The
difference between these two classes is that the class blank matches only spaces and tabs, while
space matches all blank characters, including carriage returns, new lines, form feeds, and
vertical tabs.
You can match anything that is not a space or tab with "[^[:blank:]]".
And you can match anything that is not a whitespace, carriage return, tab, form feed, space, or
vertical tab with "[^[:space:]]".
You have seen before how to match one or more or zero or more characters. But sometimes you
want to match a certain range of patterns.
For example, if your pattern is "Oh{2,4} yes", then it would match strings like "Ohh yes" or
"Ohhhh yes", but not "Oh yes" or "Ohhhhh yes".
You can also use the quantity specifier other than for a range to specify an exact number of
matches. You can do this by writing a single number inside the curly brackets.
So, if your pattern is "Oh{3} yes", then it would match only "Ohhh yes".
If you want to check for groups of characters using regular expressions you can do so using
parenthesis.
For example, you may want to match both "Penguin" and "Pumpkin", you can do so with a
regular expression like this: "P(engu|umpk)in".
RegEx patterns
pattern description
^ beginning of string
$ end of string
. any character
( ) grouping characters
[abc] any character inside the square brackets
[^abc] any character not inside the square brackets
a|b|c a OR b OR c
* zero or more of the preceding element
+ one or more of the preceding element
{n} n times the preceding element
{n,m} between n and m times the preceding element
Posix classes
In the table below you can see the posix classes we saw above, as well as some others that you
can use to create patterns.
Remember that when using a POSIX class, you always need to put it inside the square brackets
of a character class (so you'll have two pair of square brackets). For example, "a[[:digit:]]b"
matches a0b, a1b and so on.
RegEx operators
RegEx operators are usually case insensitive, meaning that they don't distinguish between
uppercase and lowercase letters. So for them, a is equivalent to A. But you can change this
default behaviour, so don't take it for granted.
Operator Description
REGEXP Gives true if it matches the given pattern
NOT REGEXP Gives true if the string doesn't contain the given pattern
Posix operators
The other kind of operators you could have available are POSIX operators. Instead of being
keywords, these are represented with punctuation, and can be case sensitive or insensitive.
operator description
~ case sensitive, true if the pattern is contained in the string
operator description
!~ case sensitive, true if the pattern is not contained in the string
~* case insensitive, true if the pattern is contained in the string
!~* case insensitive, true if the pattern is not contained in the string
Let's see how to use these operators and RegEx patterns in a query.
Example query 1
For this first example, you want to match a string in which the first character is an "s" or "p" and
the second character is a vowel.
To do this, you can use the character class [sp] to match the first letter, and you can use the
character class [aeiou] for the second letter in the string.
You also need to use the character to match the start of the string, ^, so all together you'll write
"^[sp][aeiou]".
You write the query below to get back the list of users whose names match the pattern.
And if the default case insensitive behaviour was changed, you would need to write a pattern that
allows both uppercase and lowercase letters, like "^[spSP][aeiouAEIOU]" and use it in the
query as below:
Or with the POSIX operator, in this case you could use the case insensitive operator, ~* and you
would not need to write both upper case and lower case letters inside a character class. You
could write the query as below.
As the operator is by definition case insensitive, you don't need to worry about specifying both
uppercase and lowercase letters in the character class.
These queries would give back a table with results similar to below:
name
Sergio
name
PAUL
samantha
Seraphina
Example query 2
As a second example, let's say you want to find a hexadecimal color. You can use the POSIX
class [:xdigit:] for this – it does the same as the character class [0-9a-fA-F].
You could combine them using character grouping and | to have one single RegEx pattern that
matches both, and use it in the query as below:
color
#341
#00fa67
#FF00AB
The POSIX class [:xdigit:] already includes both uppercase and lowercase letters, so you
would not need to worry about if the operator is case sensitive or not.
Depending on the size of your tables, a Contains String query can be really resource-intensive.
Be careful when you're using them in production databases, as you don't want to have your app
stop working.
Conclusion
The Contains String queries are really useful. You have learned how to use them in this article,
and you've seen a few examples.
Hopefully you have added a new tool to your arsenal, and you enjoy using it! Just be careful not
to crash your app.
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
ADVERTISEMENT
Ilenia Magoni
If you read this far, tweet to the author to show them you care.
Learn to code for free. freeCodeCamp's open source curriculum has helped more than 40,000
people get jobs as developers. Get started
ADVERTISEMENT
freeCodeCamp is a donor-supported tax-exempt 501(c)(3) charity organization (United States
Federal Tax Identification Number: 82-0779546)
Our mission: to help people learn to code for free. We accomplish this by creating thousands of
videos, articles, and interactive coding lessons - all freely available to the public. We also have
thousands of freeCodeCamp study groups around the world.
Donations to freeCodeCamp go toward our education initiatives, and help pay for servers,
services, and staff.
Trending Guides
What is a Framework? What Do CS Majors Do? Discord Update Failed Center an Image in CSS
What is the MVC Model? JavaScript replaceAll() Python Switch Statement Python
string.replace() What is a Relational DB? Split a String in Python
SQL HAVING What is OOP? HTML textarea NVM for Windows Git Revert File GROUP BY
in SQL 2D Array in Java How to Install NVM Percentages in Excel JavaScript Timestamp
Git List Remote Branches Git Delete Remote Branch Software Developer Career Three Dots
Operator in JS How to Format Dates in JS
Remove Item from Array JS Dual Boot Windows + Ubuntu Python Round to 2 Decimals String
to Int in JavaScript What’s the .gitignore File?
About Alumni Network Open Source Shop Support Sponsors Academic Honesty Code of
Conduct Privacy Policy Terms of Service Copyright Policy