How to Extract Data from HTML Tables

UiPath's Web Scraping tool can extract almost any type of data from websites and web applications.

Scraping HTML tables is easy and requires only a few clicks. 

Here are the steps required to quickly do this:

1. Pull up the page

Open the web page that contains the HTML table you want to scrape.

2. Run Web Scraping

Go to the Design menu and click Web Scraping.

This opens the Extract Wizard. Click Next.

3. HTML Table Auto-detection

This example uses a Google Contacts page. Once the Recorder is active (indicated by a blue hand cursor), click the first cell of the table.

That's all you have to do. The Recorder automatically detects the type of data you are trying to extract. In this case, the data is in an HTML table.

When the Recorder asks whether to extract the whole table, click Yes.
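For readers curious about what extracting an HTML table involves, here is a minimal Python sketch using only the standard library. It is not how UiPath's Recorder works internally; the class name `TableParser`, the helper `parse_table`, and the sample markup are all hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows
        self._row = None   # row being built, or None outside a <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return the table in `html` as a list of rows (lists of cell text)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup standing in for a contacts table.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Email</th></tr>
  <tr><td>Ada</td><td>ada@example.com</td></tr>
  <tr><td>Grace</td><td>grace@example.com</td></tr>
</table>
"""

rows = parse_table(SAMPLE_HTML)
for row in rows:
    print(row)
```

The sketch treats header and data cells the same way, so the first row holds the column names.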

4. Preview the Data before exporting

A preview of the extracted data appears. You can set the maximum number of results to extract. To extract all the data from the table, set the number to 0.
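The "0 means everything" convention can be sketched in a few lines of Python. This is only an illustration of the rule described above, not UiPath's implementation; the function name `limit_results` and its default of 100 are assumptions.

```python
from itertools import islice


def limit_results(rows, max_results=100):
    """Return at most `max_results` rows; 0 means no limit (all rows).

    The default of 100 is an assumption made for this sketch.
    """
    if max_results < 0:
        raise ValueError("max_results must be 0 or a positive integer")
    if max_results == 0:
        return list(rows)
    return list(islice(rows, max_results))


# Hypothetical extracted rows.
table = [
    ["Ada", "ada@example.com"],
    ["Grace", "grace@example.com"],
    ["Linus", "linus@example.com"],
]

preview = limit_results(table, 2)     # only the first two rows
everything = limit_results(table, 0)  # 0 lifts the limit
```

Asking for more results than the table contains simply returns every row.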

5. Set up page span

If the table spans multiple pages, the web scraper can keep capturing data until it reaches the end of the table. Click Yes, then click the page's Next button so that the automation knows how to move to the next page.
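The page-spanning behaviour amounts to a loop: read the rows on the current page, follow the Next link, and stop when there is no next page or the result limit is reached. The Python sketch below shows that loop under stated assumptions. The `fetch_page` callback, the `PAGES` dictionary, and the function name are hypothetical stand-ins for a real site, and nothing here reflects UiPath's internal code.

```python
def scrape_all_pages(fetch_page, first_page, max_results=0):
    """Collect rows page by page until there is no next page.

    `fetch_page(page)` must return (rows, next_page), where next_page is
    None on the last page. `max_results=0` means no limit.
    """
    collected = []
    seen = set()  # guards against a Next link that loops back
    page = first_page
    while page is not None and page not in seen:
        seen.add(page)
        rows, next_page = fetch_page(page)
        collected.extend(rows)
        if max_results and len(collected) >= max_results:
            return collected[:max_results]
        page = next_page
    return collected


# Hypothetical three-page table: page id -> (rows, id of the next page).
PAGES = {
    1: ([["Ada"], ["Grace"]], 2),
    2: ([["Linus"], ["Margaret"]], 3),
    3: ([["Dennis"]], None),
}


def fetch_page(page):
    return PAGES[page]


all_rows = scrape_all_pages(fetch_page, first_page=1)
first_three = scrape_all_pages(fetch_page, first_page=1, max_results=3)
```

The `seen` set is a defensive choice for this sketch: some sites keep the Next button enabled on the last page, which would otherwise cause an endless loop.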

6. That's it! Once you're done, you can run the automation.

The data is extracted into a DataTable and also saved as a CSV file. To find the file, go to the Workspace panel, right-click the workflow you are working on, and click Open Containing Folder.
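Once you have the CSV file, you can read it with any tool that understands CSV. As a minimal sketch, the Python snippet below loads rows with the standard `csv` module. The column names and sample content are hypothetical, and an in-memory string stands in for the exported file so that the example is self-contained.

```python
import csv
import io


def load_records(csv_file):
    """Read an open CSV file into a list of dicts keyed by the header row."""
    return list(csv.DictReader(csv_file))


# Hypothetical content standing in for the exported CSV file.
sample = io.StringIO(
    "Name,Email\n"
    "Ada,ada@example.com\n"
    "Grace,grace@example.com\n"
)

records = load_records(sample)
for record in records:
    print(record["Name"], record["Email"])
```

For a real file, pass a file object opened with `open(path, newline="", encoding="utf-8")` instead of the `StringIO` object. The encoding is an assumption, so adjust it to match the exported file.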