VBA Web Scraping

What is Excel VBA Web Scraping?

Excel VBA (Visual Basic for Applications) web scraping refers to the process of extracting data from websites using VBA code in Microsoft Excel.

VBA Web scraping involves fetching and extracting data from web pages and VBA web scraping without a browser. By writing VBA code, you can instruct Excel to navigate to a website, retrieve specific information from the HTML source code, and then store that data in your Excel workbook. See a simple example of performing Web Scraping in VBA with the example below.

Excel VBA Web Scraping 1

It opens a session of Internet Explorer by opening the link “google.com.” It waits for 2 seconds due to the Application.Wait function and then closes the session. The object used to open Internet Explorer is set to nothing to prevent any cache.

Excel VBA Web Scraping 1-1

As seen above, the Google tab opens, loads in, and then closes after 2 seconds.

Key Takeaways
  • VBA coding is used for VBA web scraping in Excel.
  • Internet Explorer or Selenium with EdgeDriver are commonly used for web automation.
  • VBA interacts with the Document Object Model (DOM) to extract and manipulate web page elements. XPath or element IDs are used to locate and interact with HTML elements.
  • We implement waiting mechanisms to ensure elements load in VBA Web Scraping before interacting.

How to perform Web Scraping in Microsoft Edge, Google Chrome and Mozilla Firefox?

Internet Explorer is generally used for Web Scraping in Excel VBA. There is no native support for other browsers, such as Google Chrome and Microsoft Edge, without third-party applications. To be able to perform VBA web scraping Microsoft Edge, you need to install Selenium Basic and select the support you need.

How to perform Web Scraping 1
How to perform Web Scraping 1-1

After installing, go to the VBA Editor and go to Tools > References in the VBA Toolbar.

How to perform Web Scraping 1-2

Select “Selenium Type Library” to be able to do Web Scraping in Excel VBA.

How to perform Web Scraping 1-3

How to Scrape Website Data using VBA?

Learn how to scrape data from the web in VBA by following the steps below.

Step 1: Once you open the Excel Workbook, in the title bar, choose the “Developer” icon and click on it.

How to Scrape Website Data using VBA 1

In Developer, select “Visual Basic.” It opens the VBA Editor. Initially, it will be empty.

How to Scrape Website Data using VBA 1-1

In the Editor, select “Insert” > “New Module” present in the toolbar. Now, you are ready to start coding in Excel VBA.

How to Scrape Website Data using VBA 1-2

Step 2: Name the subroutine to check the presence of a word in any website and say whether it exists.

How to Scrape Website Data using VBA 1-3

Step 3: Create a VBA Object and use it to open the Internet Explorer.

How to Scrape Website Data using VBA 1-4

Step 4: Store the URL you want to search the word in a string variable.

How to Scrape Website Data using VBA 1-5

For this example, the VBA tutorial’s search result in Google is copied.

How to Scrape Website Data using VBA 1-6

Copy the URL and paste it as a string in the variable.

Step 5: Define the text you want to search on the webpage in a string variable.

How to Scrape Website Data using VBA 1-7

Step 6: Set the visibility of Internet Explorer as true. Then, make it go to the URL mentioned earlier.

How to Scrape Website Data using VBA 1-8

Step 7: Wait for 2 seconds for the page to load properly using the VBA Wait function.

How to Scrape Website Data using VBA 1-9

Step 8: Store the HTML of the webpage in a string variable. It will be used to search for the text.

How to Scrape Website Data using VBA 1-10

Step 9: Check for the presence of the word on the webpage with an If condition.

How to Scrape Website Data using VBA 1-11

The InStr function searches for the occurrence of the text to be searched within the text stored in the string, starting from the first character (position 1). The function returns the position of the first occurrence of searchText in pageSource. If not found, it returns 0.

vbTextCompare: It is a constant in the InStr function, which specifies a text comparison mode for InStr. In this case, it performs a case-insensitive search.

If the text is found, it then prints the result in the Immediate tab.

Step 10: In case the word is not found, and the InStr function returns 0, then the Else part of the If-Else condition is executed telling you that the word does not exist by printing it in the Immediate tab.

How to Scrape Website Data using VBA 1-12

Step 11: Close the Internet Explorer session after everything is done. After that, you can set the Object to “Nothing” to empty any cache in the object, if any.

How to Scrape Website Data using VBA 1-13

Code:

Sub SearchTextInPage()

    Dim objIE As Object

    Set objIE = CreateObject(“InternetExplorer.Application”)

    Dim url As String

    url = “https://www.google.com/search?q=vba+tutorial&oq=vba+tut&gs_lcrp=EgZjaHJvbWUqDQgBEAAYkQIYgAQYigUyBggAEEUYOTINCAEQABiRAhiABBiKBTINCAIQABiRAhiABBiKBTIHCAMQABiABDIHCAQQABiABDIHCAUQABiABDIHCAYQABiABDIHCAcQABiABDINCAgQABiRAhiABBiKBdIBCDQyODZqMGoxqAIAsAIA&sourceid=chrome&ie=UTF-8”

    Dim searchText As String

    searchText = “Tutorial”

    objIE.Visible = True

    objIE.navigate url

    Application.Wait Now + TimeValue(“00:00:02”)

    Dim pageSource As String

    pageSource = objIE.document.body.outerHTML

    If InStr(1, pageSource, searchText, vbTextCompare) > 0 Then

        Debug.Print “Search text ‘” & searchText & “‘ found in the page.”

    Else

        Debug.Print “Search text ‘” & searchText & “‘ not found in the page.”

    End If

    objIE.Quit

    Set objIE = Nothing

End Sub

Step 12: Click “F5” or the “Run” icon to open an Internet Explorer session with the URL you put.

How to Scrape Website Data using VBA 1-14

After it closes, the result is printed in the Immediate tab.

How to Scrape Website Data using VBA 1-15

Examples

See the different ways you can use VBA to perform Web Scraping below.

Example #1

Given a website, you want to find the total number of backlinks or link references present in it. You can do this by finding the HTML element ‘a’, which is used as a href to refer to another website. You can simply add a counter to it after that.

Step 1: Define a subroutine to find the number of links in a webpage.

VBA Web Scraping Example 1

Step 2: Create an Internet Explorer session by declaring a VBA Object.

VBA Web Scraping Example 1-1

Step 3: Get the URL you want to search for the number of links.

VBA Web Scraping Example 1-2

Step 4: Make the viewing of the session “True” and go to the URL specified using the “navigate” function.

VBA Web Scraping Example 1-3

Step 5: Wait for the webpage to finish loading and to continue the loop till then.

VBA Web Scraping Example 1-4
  • objIE.Busy: Checks if Internet Explorer is still busy performing operations. It is True while a webpage is still loading or performing other activities.
  • objIE.readyState <> 4: Checks if the readyState property of Internet Explorer is not equal to 4. The readyState property indicates the loading status of the document. A readyState of 4 means the document has completely loaded.
  • DoEvents: Allows the operating system to process other events in the queue. It prevents the VBA script from becoming unresponsive during the waiting period.
  • Loop: Continues looping until both conditions (objIE.Busy and objIE.readyState <> 4) are false, indicating that Internet Explorer has finished loading the webpage.

Step 6: Define another object and store the number of elements with the tag ‘a’ in href. It means that if you click on these, you will be redirected to another link.

VBA Web Scraping Example 1-5

Step 7: Print the number of links embedded in the webpage in the Immediate tab.

VBA Web Scraping Example 1-6

Step 8: Close the window and set the Object as “Nothing”.

VBA Web Scraping Example 1-7

Code:

Sub NumberOfLinks()

    Dim objIE As Object

    Dim link As Object

    Set objIE = CreateObject(“InternetExplorer.Application”)

    url = “https://www.excelmojo.com/vba/”

    objIE.Visible = True

    objIE.navigate url

    Do While objIE.Busy Or objIE.readyState <> 4

        DoEvents

    Loop

    Set link = objIE.document.getElementsByTagName(“a”)

    Debug.Print “Number of links: ” & link.Length

    objIE.Quit

    Set objIE = Nothing

End Sub

Step 9: Press “F5” to run the subroutine. The output is displayed as shown below:

VBA Web Scraping Example 1-8
VBA Web Scraping Example 1-9

Example #2

In this example, you will navigate through multiple sites through Web Automation and then close the application.

Step 1: Create a subroutine to go through multiple websites automatically.

VBA Web Scraping Example 2

Step 2: Create a new Internet Explorer session with a VBA Object.

VBA Web Scraping Example 2-1

Step 3: Define the first URL. You can go through the websites in any order.

VBA Web Scraping Example 2-2

Step 4: Define the second URL.

VBA Web Scraping Example 2-3

Step 5: Make the Internet Explorer session visible by setting it as “True”. Then navigate to the URL of your choice.

VBA Web Scraping Example 2-4

Step 6: Keep looping until the webpage is fully loaded, and then wait for 2 seconds on the website.

VBA Web Scraping Example 2-5

Here, the Busy and readyState values are used to check if the webpage is fully loaded or not. It is looped so that VBA does not go inactive.

Step 7: Then navigate to the second URL.

VBA Web Scraping Example 2-6

Step 8: Similarly, wait until the webpage has loaded and wait for 2 seconds.

VBA Web Scraping Example 2-7

The Busy component is used to check if the webpage is still loading.

readyState is used to check if the value 4 has been reached or not. If it isn’t, that means that the website is still loading. It means that despite any errors further in the subroutine, the Internet Explorer instance will still be running.

Step 9: Close the Internet Explorer session and set the Object in VBA to “Nothing” to clear it.

VBA Web Scraping Example 2-8

Code:

Sub WebScrapingExample()

    Dim objIE As Object

    Set objIE = CreateObject(“InternetExplorer.Application”)

    Dim url As String

    url = “https://www.google.com/”

    Dim url1 As String

    url1 = “https://www.github.com/”

    objIE.Visible = True

    objIE.navigate url1

    Do While objIE.Busy Or objIE.readyState <> 4

        DoEvents

    Loop

    Application.Wait Now + TimeValue(“00:00:02”)

    objIE.navigate url

    Do While objIE.Busy Or objIE.readyState <> 4

        DoEvents

    Loop

    Application.Wait Now + TimeValue(“00:00:02”)

    objIE.Quit

    Set objIE = Nothing

End Sub

Step 10: Click the green arrow button to start the VBA web scraping session. First, the GitHub page will be opened. Then, the Google page will be opened. After 2 seconds, the Window will close.

VBA Web Scraping Example 2-9
VBA Web Scraping Example 2-10

Example #3

In this example, you will go through a website to print all the neighborhoods of Manhattan available on this website. With the correct HTML tag selection, copy the XPATH by pressing “Ctrl+Shift+I” to start the Inspect element. Then, copy all the content in the tags and print them in an Excel sheet using VBA.

Step 1: Create a sub-procedure to print the neighborhoods in Manhattan, NYC.

VBA Web Scraping Example 3

Step 2: Like in the previous examples, create a new Internet Explorer session with a new Object.

VBA Web Scraping Example 3-1

Step 3: Get the link to get all the names of the neighborhoods in Manhattan.

VBA Web Scraping Example 3-2

Step 4: Show the Internet Explorer session by setting “Visible” as “True.” Then, take the browser to the URL using the “navigate” function.

VBA Web Scraping Example 3-3

Step 5: Wait for the page to load by freezing the application for 5 seconds with the Application.Wait function.

VBA Web Scraping Example 3-4

Step 6: Get the HTML tag containing the locations of Manhattan and store them in an Object.

VBA Web Scraping Example 3-5

You can get the result element by opening “Inspect Element” and copying the XPath version of the table of contents.

VBA Web Scraping Example 3-6

If you copy it, you get the XPath value as ‘//*[@id=”folderNav”]/nav/ul’.

Using the “querySelector” component, write the id as #folderNav, then the content after the slashes ‘/’ with ‘>.’

querySelector(“#folderNav > nav > ul”)

Step 7: Define string and variant variables.

VBA Web Scraping Example 3-7

Step 8: Check if the result element is not empty using an If-condition.

VBA Web Scraping Example 3-8

Get the text from the XPath element and store it in a String variable. Store this text in the Variant variable where a line with the constant, vbNewLine, separates each element.

Step 9: Save the elements and dimensions of the matrix from the first row to the highest value in the result array. It is only if the array isn’t empty.

VBA Web Scraping Example 3-9

Step 10: Add a new worksheet to add the list of neighborhoods in Manhattan.

VBA Web Scraping Example 3-10

Step 11: Initialize an iterative variable and start a for-loop from the Lower Bound to the Upper Bound of the array.

VBA Web Scraping Example 3-11

LBound returns the first value of the array.

UBound returns the last value of the array.

It is beneficial in cases where you don’t know the size of your VBA array.

Step 12: Paste the elements in the array into the cells of the worksheet you’ve newly created. Then, continue the FOR-loop till all the elements are done.

VBA Web Scraping Example 3-12
VBA Web Scraping Example 3-13

Step 13: Declare an Else condition in case there are no elements found in the webpage to prevent errors. Then, close the Internet Explorer session and set the Object to “Nothing” to clear its cache.

VBA Web Scraping Example 3-14
VBA Web Scraping Example 3-15

Code:

Sub ScrapeNYCNeighborhoods()

    Dim objIE As Object

    Set objIE = CreateObject(“InternetExplorer.Application”)

    Dim url As String

    url = “https://www.cityneighborhoods.nyc/manhattan-neighborhoods”

    objIE.Visible = True

    objIE.navigate url

    Application.Wait Now + TimeValue(“00:00:05”)

    Dim resultElement As Object

    Set resultElement = objIE.document.querySelector(“#folderNav > nav > ul”)

    Dim resultText As String

    Dim resultArray As Variant

    If Not resultElement Is Nothing Then

        resultText = resultElement.innerText

        resultArray = Split(resultText, vbNewLine)

        If UBound(resultArray) > 0 Then

            ReDim Preserve resultArray(1 To UBound(resultArray))

        End If

        ‘ Print the list into one column in Excel

        Dim ws As Worksheet

        Set ws = ThisWorkbook.Sheets.Add

        Dim i As Integer

        For i = LBound(resultArray) To UBound(resultArray)

            ws.Cells(i, 1).value = resultArray(i)

        Next i

    Else

        MsgBox “Element not found.”

    End If

    objIE.Quit

    Set objIE = Nothing

End Sub

Step 14: Run the above subroutine to print the neighborhoods in Manhattan, NYC into your Excel sheet.

VBA Web Scraping Example 3-16

Go to the Excel sheet newly created. All the locations in Manhattan are printed successfully.

VBA Web Scraping Example 3-17

Important Things To Note

  • Check the website’s terms of service and robots.txt file for scraping permissions.
  • Avoid sending too many requests in a short period to prevent IP blocking. This is common while performing VBA Web Scraping Microsoft Edge.
  • Avoid circumventing login or access controls without proper authorization.
  • Follow website rules and terms; do not proceed if scraping is explicitly prohibited.

Frequently Asked Questions (FAQs)

1) Which websites can I scrape using VBA?

You can scrape websites using VBA for publicly accessible data, respecting the site’s terms of service. Avoid scraping sensitive or private information, and be aware that some websites may have measures to prevent scraping.

2) How do I handle dynamic content when web scraping with VBA?

Use Explicit Waits: Implement waiting mechanisms like Application.Wait to ensure elements load before interacting.
Leverage ReadyState: Check the ReadyState property to confirm the page has fully loaded.
Inspect AJAX Calls: Monitor network requests and response patterns for dynamic content changes.
Utilize Selenium: If VBA alone isn’t sufficient, consider integrating Selenium for more advanced interactions. It is not possible to do VBA web scraping Chrome without selenium when you are searching for Element ID’s in XPATH.
Adjust Timing: Experiment with different delay durations to synchronize with dynamic updates.

3) How do I handle authentication when scraping with VBA?

● Use SendKeys or getElementById to input credentials directly for the VBA web scraping click button. Simply get the element’s HTML ID by class or by its XPATH to click on buttons.
● Embed the WebBrowser control in a UserForm for interactive login.
● If supported, include credentials in the URL (not recommended for security reasons).
● If available, use an API for authenticated data retrieval.
● For complex authentication flows, consider integrating Selenium with VBA.

4) Can VBA handle AJAX requests during web scraping?

VBA itself has limited support for handling AJAX requests directly. However, you can handle AJAX requests in VBA web scraping by:
● Using ReadyState Property
Leveraging Wait Functions: Use Application.Wait or custom wait functions to pause execution until the AJAX request finishes.
Inspecting Network Traffic: Analyze the network traffic using browser developer tools to understand AJAX requests and their responses.

This has been a guide to VBA Web Scraping. Here we learn How to Scrape Website Data using VBA along with examples. You can learn more from the following articles –

Reader Interactions

Leave a Reply

Your email address will not be published. Required fields are marked *