While many of us have relied on text functions like FIND, LEFT, and SUBSTITUTE in Excel to perform operations on text strings. It is important to understand that these functions come with their limitations. You can use Visual Basic for Applications (VBA) in Excel to surpass this limitation. It can provides enhanced flexibility and time-saving benefits.
In this article, we will be exploring the following topics in detail –
Let’s dive into the article to understand and unlock the potential of Regex in Excel!
Download the Excel Workbook below to follow along and understand how to use Regex in Excel –
download excel workbookRegEx-Function-in-Excel.xlsm
Regex in Excel
Regex is a sequence of characters consisting of letters, numbers, special characters, operators, and constructs. It is a powerful tool that enables users to describe and identify patterns within a text. You can define rules using regex to extract, replace, or match patterns based on certain criteria.
For example, [from–to] means you can match any character mentioned in the bracket.
- [0-9] – Any single digit from 0 to 9
- [a-z] – Any single lowercase letter
- [A-Z] – Any single uppercase letter
Although Excel lacks a built-in regex function, you can use regex with the help of Visual Basic for Applications (VBA).
How to Use VBA for Regex
You can use VBA to create custom Regex Functions in Excel. Let’s see how it can be done:
STEP 1: Press Alt + F11 to open the window for Microsoft Visual Basic for Applications.
Or, Go to Developer > Visual Basic.
STEP 2: Go to Insert > Module and copy-paste the following code in it.
STEP 3: Download this file and copy-paste the code into the module section –
STEP 4: Press Ctrl + S to save the Excel Workbook.
STEP 5: In the dialog box, select No to save the WorkBook as Macro enabled.
STEP 6: In the Save as dialog box, select Excel Macro-Enabled Workbook from Save as type dropdown. Click Save.
This Macro-Enabled Workbook will now have three custom Regex Functions saved and ready to be used. Let’s take a look at them.
Match Patterns using Regex
The custom function that has been created using VBA to match patterns is used MatchPatternUsingRegex. The syntax of this function is –
=MatchPatternUsingRegex(input_range, pattern, [match_case])
where,
- input_range – the range of cells you want to search for the pattern. Required.
- pattern – the regular expression pattern you want to match. Required.
- match_case – specify whether the pattern should be case-sensitive or not (default is True). Optional.
In this example, we want to check if the email address mentioned in column A belongs to the domain “@gmail.com”.
The pattern that we will be using to check the same is –
[a-zA-Z0-9._%+-]+@gmail\.com
- [a-zA-Z0-9._%+-]+ This part matches one or more occurrences of any of these lowercase letters (a-z), uppercase letters (A-Z), digits (0-9), period (.), underscore (_), percent sign (%), plus sign (+), or hyphen/minus sign (-). The pattern will match these characters before the @ sign.
- gmail\.com This part will match the exact text string “@gmail.com”.
Follow the steps below to match patterns using Regex in Excel –
STEP 1: Enter the formula.
=MatchPatternUsingRegex(
STEP 2: Select the cell containing the text you want to check.
=MatchPatternUsingRegex(A2,
STEP 3: Type the pattern you want to check.
=MatchPatternUsingRegex(A2,”[a-zA-Z0-9._%+-]+@gmail\.com”)
In the result, you can see that all the email addresses that belong to Gmail provide the value TRUE, else FALSE.
Replace Patterns using Regex
The custom function that has been created using VBA to replace patterns is used ReplaceUsingRegex. The syntax of this function is –
=ReplaceUsingRegex(text, pattern, replacement)
where,
- text – the cells that contain the text string. Required.
- pattern – the regular expression pattern you want to replace. Required.
- replacement – the text that you want to replace the pattern with. Required.
In this example, we have a list of names but it contains unwanted characters like exclamation mark (!), @, full stop (.), digits (0-9), etc. We will use this function to replace all unwanted characters with an empty string.
The pattern that we will be using is “[^a-zA-Z/s]”. It means any character that is not mentioned in the bracket i.e. lowercase letters (a-z), uppercase letters (A-Z), and space (/s).
Follow the steps below to replace patterns using Regex in Excel –
STEP 1: Enter the formula.
=ReplaceUsingRegex(
STEP 2: Select the cell containing the text string.
=ReplaceUsingRegex(A2,
STEP 3: Type the pattern you want to replace.
=ReplaceUsingRegex(A2,”[^a-zA-Z/s]”
STEP 4: Type an empty string.
=ReplaceUsingRegex(A2,”[^a-zA-Z/s]”,””)
In the result, you can see that all the characters except lowercase letters (a-z), uppercase letters (A-Z), and space (/s) have been replaced by an empty string.
Extract Patterns using Regex
The custom function that has been created using VBA to extract patterns from text strings is ExtractUsingRegex. The syntax of this function is –
=ExtractUsingRegex(text, pattern)
where,
- text – the cells that contain the text string. Required.
- pattern– regular expression pattern you want to extract. Required.
In this example, we have a list of names and want to extract the last name. Here, we will be using the pattern “\w+$” that will help us extract the end of the string.
- /w – indicates any character including uppercase letters, lowercase letters, digits, and underscore.
- + – indicates one or more characters
- $ – indicates the end of the string.
Follow the steps below to extract patterns using Regex in Excel –
STEP 1: Enter the formula.
=ExtractUsingRegex(
STEP 2: Select the cell containing the text string.
=ExtractUsingRegex(A2,
STEP 3: Type the pattern you want to extract.
=ExtractUsingRegex(A2,”\w+$”)
In the result, we can see that the end of the string i.e. the last name has been extracted.
Conclusion
Regex is an extremely powerful tool that can be] used to make data analysis tasks simpler and streamlined. While Excel provides built-in text functions, they have limitations.
VBA in Excel is a flexible and time-saving alternative for Excel users. VBA for regex can be used for matching patterns, replacing patterns, and extracting patterns. By using custom regex functions created through VBA, users can apply complex pattern matching and manipulation to text data in Excel.
John Michaloudis is a former accountant and finance analyst at General Electric, a Microsoft MVP since 2020, an Amazon #1 bestselling author of 4 Microsoft Excel books and teacher of Microsoft Excel & Office over at his flagship MyExcelOnline Academy Online Course.