What is RegEx in Excel VBA?
RegEx is a pattern-matching technique that allows you to manipulate text based on complex patterns of characters. For example, RegEx in Excel VBA can be used to find and replace text, extract specific parts of a string, validate input data, and more. RegEx patterns are defined using a combination of special characters and syntax representing different characters and operations.
Let’s look at an example. Suppose we have an Excel dataset with the word “world.”
Now we want to replace it with “universe” using RegEx in Excel VBA. This can be accomplished using the following VBA code:
The resulting output looks like this.
Table of contents
Key Takeaways
- RegEx in VBA allows you to search for and manipulate text using complex patterns.
- To enable RegEx in VBA, you must add a reference to the Microsoft VBScript Regular Expressions library.
- The .Match() method of the RegExp object returns information about the first match, while the .Execute() method returns information about all matches.
- When using RegEx in VBA, thoroughly test your patterns and consider the performance implications for large amounts of data.
How to Enable RegEx in Excel VBA?
Before using RegEx in Excel VBA, you must enable the RegEx library. To do this, follow these steps:
- Open the VBA Editor by pressing Alt + F11 in Excel.
- In the VBA Editor, go to Tools → References.
- Scroll down the list of available references and check the box next to “Microsoft VBScript Regular Expressions 5.5”.
- Click OK to close the References dialog box.
- Once you have enabled the RegEx library, you can begin using RegEx in Excel VBA code.
Excel VBA – All in One Courses Bundle (35+ Hours of Video Tutorials)
If you want to learn Excel and VBA professionally, then Excel VBA All in One Courses Bundle (35+ hours) is the perfect solution. Whether you’re a beginner or an experienced user, this bundle covers it all – from Basic Excel to Advanced Excel, Macros, Power Query, and VBA.
VBA RegEx Pattern
RegEx patterns in VBA are defined using a combination of special characters and syntax representing different characters and operations. Here, are some of the most used RegEx characters and their description.
Syntax | Description |
---|---|
. | Dot. It matches any single character except a new line. |
* | Asterix. It matches zero or more occurrences of the preceding character or group. |
+ | Plus. It matches one or more occurrences of the preceding character or group. |
? | Question. It matches zero or one occurrence of the preceding character or group. |
^ | Caret. It matches the beginning of a string. |
$ | Dollar. It matches the end of a string. |
[] | Square brackets. It matches any character within the brackets. |
[^] | Negated square brackets. It matches any character that is not within the brackets. |
{} | Parentheses. Groups characters together, allowing for more complex pattern matching. |
| | Pipe. Matches one of several alternatives. |
- Here are some examples of RegEx patterns in VBA:
Syntax | Name of Syntax | Explanation |
---|---|---|
“apple” | Literal String | Matches the literal string “apple”. |
“a.ple” | Dot | Matches any single character except a newline in place of the dot, so it fits “apple”, “abple,” “a_ple,” etc. |
“a*ple” | Asterisk | Matches zero or more occurrences of the preceding character in place of the asterisk, so it matches “aple,” “apple,” “aaaaaple,” etc. |
“a+ple” | Plus | Matches one or more occurrences of the preceding character in place of the plus, so it matches “aple,” “apple,” “aaaaaple,” but not “ple.” |
“ap?le” | Question | Matches zero or one occurrence of the preceding character in place of the question mark, so it matches “aple” or “apple.” |
“^apple” | Caret | Matches “apple” at the beginning of a string. |
“apple$” | Dollar Sign | Matches “apple” at the end of a string. |
“[abc]pple” | Square Brackets | Matches any character within the brackets, so it matches “apple,” “bpple,” or “cpple.” |
“[^abc]pple” | Negated Square Brackets | Matches any character that is not within the brackets, so it matches any five-letter word that ends with “pple,” except “apple,” “bpple,” or “cpple.” |
“(apple|orange|banana)” | Pipe | Matches one of several alternatives, so it matches “apple,” “orange,” or “banana.” |
Properties and Methods of RegEx Object
In VBA, RegEx patterns are defined using a RegExp object. The RegExp object has several properties and methods that you can use to control its behavior and perform operations on the text you’re searching. Here are some of the most used properties and methods of the RegExp object:
Properties of VBA RegEx Object
- .Pattern: Gets or sets the regular expression pattern.
- .IgnoreCase: Gets or sets a value indicating whether the regular expression is case-insensitive.
- .Global: Gets or sets a value indicating whether the regular expression is global (matches all occurrences) or not.
- .MultiLine: Gets or sets a value indicating whether the regular expression should match across multiple lines.
Methods of RegEx Object
- .Test(strInput): Returns True if the regular expression pattern matches any part of the input string; otherwise False.
- .Execute(strInput): Executes the regular expression pattern against the input string and returns a MatchCollection object containing all matches.
- .Replace(strInput, strReplacement): Replaces all occurrences of the regular expression pattern in the input string with the specified replacement string and returns the modified string.
- .Match(strInput): Searches the input string for the first occurrence of the regular expression pattern and returns a Match object containing information about the match.
- .Matches(strInput): Searches the input string for all occurrences of the regular expression pattern and returns a MatchCollection object containing information about each match.
Example
- Let us look at an example of using RegEx in Excel VBA:
In this example, we will use regular expressions in Excel VBA to extract email addresses from a text column. We will write a code to search for email addresses within each cell of the specified column, then output only those email addresses into a separate column.
It is a simple example of how RegEx in Excel VBA can identify and extract specific data from a larger set of unstructured text. Column A has unstructured text with phone numbers and email addresses, and we shall write the RegEx in Excel VBA code to get only the email addresses in the next column.
- Step 1: Press Alt + F11 in Excel to open the VBA Editor.
- Step 2: Make sure to enable the RegEx library. Please refer to the above section, “How to Enable RegEx in Excel VBA?”
- Step 3: Insert a new module by clicking “Insert” from the top menu and selecting “Module.”
- Step 4: Type the following code in the module.
Declare the necessary variables – RegExp object, MatchCollection, Match, string pattern, input string, range object for column A and a cell object to loop through each cell in the range.
- Step 5: Define the regular expression pattern for email IDs.
Explanation:
The above code is a pattern that checks if a given text is an email address.
- “[a-zA-Z0-9._%+-]+” matches one or more characters that are either letters (a to z, both uppercase and lowercase), numbers (0 to 9), or characters.
- “_%+-“ matches the local part of an email address, which may contain these characters.
- “@” matches the @ symbol in an email address, separating the local and domain parts.
- “[a-zA-Z0-9.-]+” matches one or more characters that are either letters (a to z, both uppercase and lowercase), numbers (0 to 9), or characters – or “.” It may be in the domain name of an email address
- “\.” matches a literal dot (.) in the domain name of an email address.
- “[a-zA-Z]{2,}” matches two or more characters that are either letters (a to z, both uppercase and lowercase). It matches the top-level domain of an email address, which is usually a two-letter country code or a generic domain like com, net, org, etc.
- Step 6: Get the range of cells in column A from the first row to the last row with data and loop through each cell in the range and execute the RegEx pattern against the input string to extract the email IDs.
This code is used to find all occurrences of a specific pattern in each text stored in the variable “strInput.”
- The code “regEx.Pattern = strPattern” sets the pattern for the search.
- The code “Set matches = regEx.Execute(strInput)” executes the search and returns a collection of all matches found in the text
So, after running these two lines of code, the “matches” collection contains all the pattern occurrences in the text.
- Step 7: Loop through each match found and print the email ID in the adjacent cell in column B.
- Step 8: Finally, for each match found in the input string, the email ID is printed in the adjacent cell in column B using the Offset() method. The loop then moves to the next cell in the range, and the process repeats until all cells in the range have been processed.
- Step 9: Save the module by clicking “File” from the top menu, then select “Save” and give it a name. Make sure to save the file as a “Macro-Enabled Workbook.”
- Step 10: Press Alt+F8 to open the “Macro” dialog box.
- Step 11: Select the “ExtractEmails” macro and click on Run.
- Step 12: After running the VBA code, check the values in column B to see that all the email addresses extracted from the text in column A are now displayed in column B.
- Here is the full code:
Sub ExtractEmails()
Dim regex As New RegExp
Dim matches As MatchCollection
Dim match As match
Dim strPattern As String
Dim strInput As String
Dim rng As Range
Dim cell As Range
‘ Define the RegEx pattern for email IDs
strPattern = “[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}”
‘ Get the range of cells in column A
Set rng = Range(“A1:A” & Cells(Rows.Count, “A”).End(xlUp).Row)
For Each cell In rng
‘ Get the input string
strInput = cell.Value
‘ Execute the RegEx pattern against the input string
regex.Pattern = strPattern
Set matches = regex.Execute(strInput)
‘ Loop through each match and print the email ID in column B
For Each match In matches
cell.Offset(0, 1).Value = match.Value
Next match
Next cell
End Sub
Important Things to Note
When using RegEx in Excel VBA, there are some essential things to keep in mind:
- When using the VBA RegEx Test method, remember that it returns a Boolean value of True if the pattern matches the input string and otherwise, False. This method can be helpful when you want to check whether a string matches a pattern without necessarily needing to extract or replace any specific input portions.
- When using VBA Regex Replace function with the .Replace() method of the RegExp object, and you can refer to the capture groups in the pattern using $1, $2, etc., in the replacement string. For example, suppose you have a pattern with two capture groups and want to replace the matched text with the second capture group followed by the first capture group. You can use the replacement string “$2 $1” to achieve this.
- When working with VBA regex sub-matches, it is essential to note that the sub-match index starts at 0, with 0 representing the entire matched string. The first sub-match has an index of 1; the second has an index of 2, and so on. It’s also worth noting that if a sub-match is optional and doesn’t match anything, its value in the sub matches array will be an empty string.
myString = RegEx.Replace(myString, “hello”, “hi”)
Frequently Asked Questions (FAQs)
The .Test() method of the RegExp object returns True if the regular expression pattern matches any part of the input string; otherwise, it returns False. The .Execute() method executes the regular expression pattern against the input string and returns a MatchCollection object containing all matches.
In other words, the .Test() only tells you whether the pattern matches, while the .Execute() gives you more detailed information about the matches.
The .Match() method of the RegExp object searches the input string for the first occurrence of the regular expression pattern and returns a Match object containing information about the match. The .Execute() method executes the regular expression pattern against the input string and returns a MatchCollection object containing all matches.
The .Match() only returns information about the first match, while .Execute() returns information about all matches.
In RegEx, some characters have special meanings, such as the dot (.), which matches any character, and the asterisk (*), which matches zero or more occurrences of the preceding character. To match these characters literally, you must add a backslash (). For example, to match a period (.), you would use the pattern \.
Download Template
This article must be helpful to understand the RegEx in Excel VBA, with its pattern, properties, methods and example. You can download the template here to use it instantly.
Recommended Articles
This has been a guide to RegEx in Excel VBA. Here we explain how to enable RegEx, its pattern, properties & methods, example & downloadable excel template. You can learn more from the following articles –
Leave a Reply