By Ajeet Kumar
This tutorial on regular expression targets the beginner VBA developers who have never used RE before. Regular expressions are used by advanced users and it is available in almost all high level languages. RE learnt in one language can be easily used in other programming languages. So, learning RE is highly fruitful and important skill for all developers. Important URL and books references are also given in this tutorial for further study.
Regular Expressions are powerful tool to extract desired pattern from text strings. In almost all programming languages, regular expressions, regex called in short, are used for text parsing. Perl is considered to have very versatile usage of regular expressions. They are also used in Java, PHP, JavaScript, .NET languages etc. with slight variations in the syntax.
In VBA, we can use regular expression by referencing type library called VBScript_RegExp_55. Its API documentation is small as the number of classes in the library are in a few numbers. We can search, match, find or replace the texts using regular expressions.
Regular Expression is mainly used to find pattern in text strings.
Case Sensitive Search: The search for pattern can be case sensitive as well. For example, suppose you want to search the word Apple in a text string. In case of case sensitive search, the words like APPLE, apple will be ignored by the search engine.
Global Search: If search is global, all occurences of the text will be searched by the regex engine in the text string. If the word apple occurs five times in the text string, these all five will be returned by the engine.
Global property is assigned true value for global search.
Pattern property: This property is assigned regular expression. A regular expression uses metacharacters to specify a pattern.
Metacharacters are English alphabets or other keyboard symbols but have special meaning in the context of RE. To convert a meta character into usual character, escape character forward slash / is prefixed before it.
Important URL for RE:
https://en.m.wikipedia.org/wiki/Regular_expression
https://regular-expressions.mobi/vbscript.html
https://www.robvanderwoude.com/vbstech_regexp.php
Important books:
You also can read O'Reilly book "Mastering Regular Expressions". Google Book
Some elementary examples of regex are given below used with VBA.
EXAMPLE(1)
METHOD1: LATE BINDING
Sub pSepFirstLast()
Dim RG As Object
Dim MTC As Object
Dim MT As Object
Dim STR As String
Dim Arr
STR = "Ajeet Anuja Kumud Rambha"
Set RG = CreateObject("VBScript.RegExp")
With RG
.IgnoreCase = True
.Global = True
.Pattern = "\w+"
End With
Set MTC = RG.Execute(STR)
For Each MT In MTC
Debug.Print MT.Value, MT.Length, MT.FirstIndex
Next MT
''First Word
Debug.Print "First String: " & MTC.Item(0)
''Last Word
Debug.Print "Last String: " & MTC.Item(MTC.Count - 1)
End Sub
EXAMPLE(2)
METHOD2: EARLY BINDING
Sub pSepFirstLast()
Dim RG As VBScript_RegExp_55.RegExp
Dim MTC As MatchCollection
Dim MT As Match
Dim STR As String
Dim Arr
STR = "Ajeet Anuja Kumud Rambha"
'' You can use New keyword or as follows
Set RG = CreateObject("VBScript.RegExp")
With RG
.IgnoreCase = True
.Global = True
.Pattern = "\w+"
End With
Set MTC = RG.Execute(STR)
For Each MT In MTC
Debug.Print MT.Value, MT.Length, MT.FirstIndex
Next MT
''First Word
Debug.Print "First String: " & MTC.Item(0)
''Last Word
Debug.Print "Last String: " & MTC.Item(MTC.Count - 1)
End Sub
EXAMPLE(3)
Sub pSeparateAmount()
Dim Regx As Object
Dim MatchGr As Object
Dim strCell As String
Dim Amt As Match
strCell = "eruy$12.4eyreiy765.9483ffjdk599.23 D"
Set Regx = CreateObject("VBScript.RegExp")
With Regx
.Global = True
.IgnoreCase = True
.Pattern = "\$?\d*\.\d+"
End With
Set MatchGr = Regx.Execute(strCell)
For Each Amt In MatchGr
Debug.Print Amt.FirstIndex, Amt.Length, Amt.Value
Next
End Sub
EXAMPLE(4)
Sub pSeparateLastAmount()
Dim Regx As Object
Dim MatchGr As Object
Dim strCell As String
Dim MT As Object
strCell = "eruy$12.4eyreiy765.9483ffjdk599.23 "
Set Regx = CreateObject("VBScript.RegExp")
With Regx
.Global = True
.IgnoreCase = True
.Pattern = "(\$?\d*\.\d+)(\s)+$"
End With
''Execute returns an array of matched texts
''MatchGr refers to MatchCollection class
Set MatchGr = Regx.Execute(strCell)
Debug.Print MatchGr.Item(0)
'' Amt refers to Match class object
End Sub
METHOD1: LATE BINDING
Sub pSepFirstLast()
Dim RG As Object
Dim MTC As Object
Dim MT As Object
Dim STR As String
Dim Arr
STR = "Ajeet Anuja Kumud Rambha"
Set RG = CreateObject("VBScript.RegExp")
With RG
.IgnoreCase = True
.Global = True
.Pattern = "\w+"
End With
Set MTC = RG.Execute(STR)
For Each MT In MTC
Debug.Print MT.Value, MT.Length, MT.FirstIndex
Next MT
''First Word
Debug.Print "First String: " & MTC.Item(0)
''Last Word
Debug.Print "Last String: " & MTC.Item(MTC.Count - 1)
End Sub
EXAMPLE(2)
METHOD2: EARLY BINDING
Sub pSepFirstLast()
Dim RG As VBScript_RegExp_55.RegExp
Dim MTC As MatchCollection
Dim MT As Match
Dim STR As String
Dim Arr
STR = "Ajeet Anuja Kumud Rambha"
'' You can use New keyword or as follows
Set RG = CreateObject("VBScript.RegExp")
With RG
.IgnoreCase = True
.Global = True
.Pattern = "\w+"
End With
Set MTC = RG.Execute(STR)
For Each MT In MTC
Debug.Print MT.Value, MT.Length, MT.FirstIndex
Next MT
''First Word
Debug.Print "First String: " & MTC.Item(0)
''Last Word
Debug.Print "Last String: " & MTC.Item(MTC.Count - 1)
End Sub
EXAMPLE(3)
Sub pSeparateAmount()
Dim Regx As Object
Dim MatchGr As Object
Dim strCell As String
Dim Amt As Match
strCell = "eruy$12.4eyreiy765.9483ffjdk599.23 D"
Set Regx = CreateObject("VBScript.RegExp")
With Regx
.Global = True
.IgnoreCase = True
.Pattern = "\$?\d*\.\d+"
End With
Set MatchGr = Regx.Execute(strCell)
For Each Amt In MatchGr
Debug.Print Amt.FirstIndex, Amt.Length, Amt.Value
Next
End Sub
EXAMPLE(4)
Sub pSeparateLastAmount()
Dim Regx As Object
Dim MatchGr As Object
Dim strCell As String
Dim MT As Object
strCell = "eruy$12.4eyreiy765.9483ffjdk599.23 "
Set Regx = CreateObject("VBScript.RegExp")
With Regx
.Global = True
.IgnoreCase = True
.Pattern = "(\$?\d*\.\d+)(\s)+$"
End With
''Execute returns an array of matched texts
''MatchGr refers to MatchCollection class
Set MatchGr = Regx.Execute(strCell)
Debug.Print MatchGr.Item(0)
'' Amt refers to Match class object
End Sub
No comments:
Post a Comment