Thursday, January 26, 2017

Excel VBA- Regular Expressions

By Ajeet Kumar
This tutorial on regular expression targets the beginner VBA developers who have never used RE before. Regular expressions are used by advanced users and it is available in almost all high level languages. RE learnt in one language can be easily used in other programming languages. So, learning RE is highly fruitful and important skill for all developers. Important URL and books references are also given in this tutorial for further study.
Regular Expressions are powerful tool to extract desired pattern from text strings. In almost all programming languages, regular expressions, regex called in short, are used for text parsing. Perl is considered to have very versatile usage of regular expressions. They are also used in Java, PHP, JavaScript, .NET languages etc. with slight variations in the syntax.

In VBA, we can use regular expression by referencing type library called VBScript_RegExp_55. Its API documentation is small as the number of classes in the library are in a few numbers. We can search, match, find or replace the texts using regular expressions.
Regular Expression is mainly used to find pattern in text strings.

Case Sensitive Search: The search for pattern can be case sensitive as well. For example, suppose you want to search the word Apple in a text string. In case of case sensitive search, the words like APPLE, apple will be ignored by the search engine.

Global Search: If search is global, all occurences of the text will be searched by the regex engine in the text string. If the word apple occurs five times in the text string, these all five will be returned by the engine.
Global property is assigned true value for global search.

Pattern property: This property is assigned regular expression. A regular expression uses metacharacters to specify a pattern.

Metacharacters are English alphabets or other keyboard symbols but have special meaning in the context of RE. To convert a meta character into usual character, escape character forward slash / is prefixed before it.

Important books:

You also can read O'Reilly book "Mastering Regular Expressions". Google Book
Some elementary examples of regex are given below used with VBA.
EXAMPLE(1)
METHOD1: LATE BINDING
Sub pSepFirstLast()
    Dim RG As Object
    Dim MTC As Object
    Dim MT As Object
    Dim STR As String
    Dim Arr
    STR = "Ajeet Anuja Kumud Rambha"
    Set RG = CreateObject("VBScript.RegExp")
    With RG
        .IgnoreCase = True
        .Global = True
        .Pattern = "\w+"
    End With
    Set MTC = RG.Execute(STR)
    For Each MT In MTC
        Debug.Print MT.Value, MT.Length, MT.FirstIndex
    Next MT
    ''First Word
    Debug.Print "First String: " & MTC.Item(0)
    ''Last Word
    Debug.Print "Last String: " & MTC.Item(MTC.Count - 1)
End Sub

EXAMPLE(2)
METHOD2: EARLY BINDING
Sub pSepFirstLast()
    Dim RG As VBScript_RegExp_55.RegExp
    Dim MTC As MatchCollection
    Dim MT As Match
    Dim STR As String
    Dim Arr
    STR = "Ajeet Anuja Kumud Rambha"
'' You can use New keyword or as follows
    Set RG = CreateObject("VBScript.RegExp")
    With RG
        .IgnoreCase = True
        .Global = True
        .Pattern = "\w+"
    End With
    Set MTC = RG.Execute(STR)
    For Each MT In MTC
        Debug.Print MT.Value, MT.Length, MT.FirstIndex
    Next MT
    ''First Word
    Debug.Print "First String: " & MTC.Item(0)
    ''Last Word
    Debug.Print "Last String: " & MTC.Item(MTC.Count - 1)
End Sub

EXAMPLE(3)
Sub pSeparateAmount()
    Dim Regx As Object
    Dim MatchGr As Object
    Dim strCell As String
    Dim Amt As Match
    strCell = "eruy$12.4eyreiy765.9483ffjdk599.23  D"
    Set Regx = CreateObject("VBScript.RegExp")
    With Regx
        .Global = True
        .IgnoreCase = True
        .Pattern = "\$?\d*\.\d+"
    End With
    Set MatchGr = Regx.Execute(strCell)
    For Each Amt In MatchGr
        Debug.Print Amt.FirstIndex, Amt.Length, Amt.Value
    Next
End Sub
EXAMPLE(4)
Sub pSeparateLastAmount()
    Dim Regx As Object
    Dim MatchGr As Object
    Dim strCell As String
    Dim MT As Object
  
    strCell = "eruy$12.4eyreiy765.9483ffjdk599.23  "
    Set Regx = CreateObject("VBScript.RegExp")
  
    With Regx
        .Global = True
        .IgnoreCase = True
        .Pattern = "(\$?\d*\.\d+)(\s)+$"
    End With
    ''Execute returns an array of matched texts
    ''MatchGr refers to MatchCollection class
    Set MatchGr = Regx.Execute(strCell)
    Debug.Print MatchGr.Item(0)
    '' Amt refers to Match class object
End Sub

No comments:

Post a Comment

Hot Topics