Web Scraping Structured Data.Get News
This sample gets the first 3 news stories from Mashable for a category the user selects.
Steps to automate
- Display a list of categories.
- Extract the first 3 news stories from the selected category.
- Write them into a text file.
- Use an "Input Dialog" activity to let the user choose a news category.
- Use a "FlowDecision" activity to check if the user chose "None". If so, display a friendly message.
- Otherwise, get the selected category and open the proper website (mashable.com/Category/).
- Once the page has loaded, extract the title and URL of each story, then close the tab.
- For each extracted story, navigate to its URL and extract the content (without photos). Use Design->Screen Scraping Wizard and indicate on screen the region to extract. Save the result in a text document that contains the title and content of each news story.
- Allow the user to select a news category again.
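
For readers who want to see the control flow outside the workflow designer, the steps above can be sketched in Python. This is only an illustration under stated assumptions, not the sample's actual implementation: the function names (`category_url`, `format_stories`, `run_news_loop`) are hypothetical, the `https` scheme is assumed, and the sketch assumes that choosing "None" ends the run after the message is shown, which the steps do not spell out. The scraping, dialog, and file-writing parts are passed in as callables so that nothing about Mashable's page markup is assumed.

```python
from typing import Callable, Iterable, List, Tuple

BASE_URL = "https://mashable.com"  # assumption: the steps do not give a scheme


def category_url(category: str) -> str:
    """Build the category page address in the mashable.com/Category/ form."""
    return f"{BASE_URL}/{category.strip()}/"


def format_stories(stories: Iterable[Tuple[str, str]]) -> str:
    """Render (title, content) pairs as the body of the text document."""
    return "\n\n".join(f"{title}\n{content}" for title, content in stories)


def run_news_loop(
    choose: Callable[[], str],
    list_stories: Callable[[str], List[Tuple[str, str]]],
    read_content: Callable[[str], str],
    save: Callable[[str, str], None],
    notify: Callable[[str], None],
    limit: int = 3,
) -> None:
    """Hypothetical driver: prompt, scrape, save, then prompt again."""
    while True:
        category = choose()
        if category == "None":
            # Assumption: choosing "None" shows the message and ends the run.
            notify("No category selected.")
            return
        # Keep only the first `limit` (title, url) pairs from the category page.
        stories = list_stories(category_url(category))[:limit]
        body = format_stories(
            (title, read_content(url)) for title, url in stories
        )
        save(category, body)
```

In a real run, `choose` would stand in for the input dialog, `list_stories` and `read_content` for the two extraction steps, and `save` for writing the text file.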