Inside The Rocketship
Tech Stories: Document Understanding. A Testing Story from UiPath Engineers in Cluj
Florisca Levente and Adrian Popa are both Software Test Engineers in the UiPath R&D office in Cluj, Romania.
Sit down, get cozy, grab a big cup of tea and prepare for a story. A story that has it all: an enchanted forest, castles, princesses and, on top of it all, some technical notions like software testing, RPA and document understanding. No, this isn’t the usual fairy tale. It’s a technical tale in which we present some challenges we faced when testing a document understanding RPA solution built by the UiPath team from Cluj. It won’t be your regular technical article either. We wrapped all the ideas we want to share in the three parts that form a story: the setting, the characters, and the plot.
This being said, let’s begin building the setting. This is the “Once upon a time…” part and our kingdom is our testing vision. Why is that, you might ask. Well, because all our testing efforts are consistent with the way we see the role of a tester in a team, the way we understand automation and how it combines with the manual testing effort.
Imagine a land of confusion where the ocean meets the sky and software development meets testing. In ancient times the sole role of a tester in a team was to run tests. In modern software development, a tester should do more. The “QA” from quality assurance has turned into quality assistance. We want to build good quality software instead of breaking it, and to prevent defects instead of finding bugs. For this we use automation, as many of you probably do, as a tool to help us test. Like any tool, automation needs a human. It needs a knight in shining armor before we can call it testing; otherwise it’s just a script, a robot.
Now, let us proceed, go closer and visualize our castle. Our castle is RPA (robotic process automation). Everything we build, we build to enhance our infrastructure, speed, and reliability. In RPA we work with robots. A robot is not the first Terminator model; for now, we build only digital ones. Think of it, if you like, as the beginning of Skynet: robots we order to do the repetitive work instead of us. With our robots, we are taking RPA to the next level.
The action of our story takes place in the Document Understanding chamber of the castle. The purpose of the Document Understanding solution is to classify and extract data from structured (e.g. fixed forms) and semi-structured (e.g. invoices, receipts) document types. The steps one needs to take to build a Document Understanding automation in UiPath are:
First, we define a taxonomy using the Taxonomy Manager wizard. The taxonomy contains all the document types which will be processed and the fields of interest for each document type.
Then we digitize the input image or PDF file using the Digitize Document activity, along with an OCR engine of our choice.
Now we classify the input document by dragging one or more classifiers into the Classification Scope and configuring them.
After classification, we can move on to data extraction. Here, as above, we have a Data Extraction Scope in which we drag and then configure extractors.
Finally, using the Validation Station UI tool we can visualize the extracted results and correct or approve them.
You can get more details on using the Document Understanding activities from the official documentation.
Now that we have the setting, we can get acquainted with the characters. First of all, the heroes of this story are the team’s princes and princesses, who are ambitious and believe in providing the best quality for the end-users. They have to slay a herd of dragons, which we can identify from the description above: UI tools like the Taxonomy Manager and Validation Station, OCR engines, extractors, and classifiers, to name only the most important ones.
Now comes the fun part, the plot of the story. Here we want to focus on two challenges we have faced: testing the efficiency of a new extractor and the always interesting automation testing.
How do we test an extractor? What do we look for? How do we know how a new extractor performs in comparison with an existing one? These are some questions we came up with. To answer them we drafted some steps:
Generate an ideal result based on some representative data sets.
Extract data from documents that belong to the same data set, each document having its own characteristic (different rotation and skew angles, resolutions, font types, colors, etc.).
Compare the extracted data against the ideal result and come up with a measure of the extractor’s efficiency.
Lucky for us, we had the tools to implement the above steps in our own castle. Thus, we created a solution in UiPath Studio: we generate an ideal result by using the Validation Station and manually setting the values for each field. After that, we process the documents that belong to a data set, extract data from the desired fields and compare the values with the ideal results. The comparison is done by calculating the Levenshtein distance for each field. This distance gives us the number of single-character edits needed to turn one string into another. A few computations later we have an average match percentage across the extracted values, which tells us how correct the extracted data is.
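For readers who think in code, the comparison step can be sketched in Python. This is only an illustration, not our UiPath Studio workflow: the function names, the sample field names and values, and the way the distance is turned into a percentage (one minus the distance divided by the longer string's length) are all assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def field_similarity(expected: str, extracted: str) -> float:
    """Assumed scoring: 100% for identical strings, lower as more edits are needed."""
    longest = max(len(expected), len(extracted))
    if longest == 0:
        return 100.0  # both empty: nothing to get wrong
    return 100.0 * (1 - levenshtein(expected, extracted) / longest)


def average_similarity(ideal: dict[str, str], extracted: dict[str, str]) -> float:
    """Average the per-field scores; a field missing from the extraction counts as empty."""
    if not ideal:
        raise ValueError("the ideal result must contain at least one field")
    scores = [field_similarity(value, extracted.get(name, "")) for name, value in ideal.items()]
    return sum(scores) / len(scores)


# Hypothetical fields: the ideal values set by hand versus what an extractor returned.
ideal_result = {"invoice_no": "INV-1024", "total": "199.00"}
extracted_result = {"invoice_no": "INV-1O24", "total": "199.00"}  # OCR read "0" as "O"
print(average_similarity(ideal_result, extracted_result))
```

Running the same computation for two extractors over the same data set gives two numbers that can be compared directly, which is the point of the exercise.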
Since we are building a process automation tool, what better way to test it than with our own software? We use RPA to automate flows in order to make sure our end result meets our expectations. The advantages are multiple: among others, we get results from our own end-to-end solution, with no mocked data and no simulations. The biggest challenge is to build a reliable framework that you can use. As in other automation frameworks, we split each test into three parts, following the GWT (Given, When, Then) or AAA (Arrange, Act, Assert) pattern; both are there to help you write correct tests. The maintenance and refactoring effort is the same as in a code-based framework. For me it’s a new and exciting way of working: it’s a challenge to reimagine everything, reuse previous knowledge and transform it into an RPA-based framework. The only thing I miss sometimes is the freedom coding can bring when implementing a framework.
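In a code-based framework, the three-part structure looks roughly like the Python sketch below. The classify_document function is a hypothetical stand-in for the system under test, written only so the example runs; it is not a UiPath API, and our real tests are built as RPA workflows rather than code.

```python
def classify_document(text: str) -> str:
    """Hypothetical system under test: a toy keyword classifier."""
    return "invoice" if "invoice" in text.lower() else "unknown"


def test_invoice_text_is_classified_as_invoice() -> None:
    # Arrange (Given): prepare the input document.
    document_text = "Invoice no. 1024, total due 199.00"

    # Act (When): run the single behaviour under test.
    document_type = classify_document(document_text)

    # Assert (Then): compare the outcome with the expectation.
    assert document_type == "invoice"


test_invoice_text_is_classified_as_invoice()
```

Keeping the three parts visibly separate makes it easy to see what a test prepares, what it exercises and what it checks, whether the test lives in code or in a workflow.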
When you work on products that have to be tested with multiple input types, you sometimes have the reflex to duplicate your tests. For this type of testing, in coding, we have a solution called data tables. We’ve managed to implement something similar, reading the input and assert data from a table stored in an Excel sheet. Using this architecture, we run the same test on different input/assert sets without duplicating the test itself. Further challenges will probably overcome us, but we will rise from the ashes, as real heroes do, and in the end, live happily ever after.
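As a postscript for the code-minded, the data-table idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: an inline CSV text stands in for the Excel sheet, the rows are made-up data, and classify_document is a toy stand-in for the system under test rather than anything from UiPath.

```python
import csv
import io

# Stand-in for the Excel sheet: one row per input/expected pair (made-up data).
TEST_TABLE = """input_text,expected_type
Invoice no. 1024,invoice
Receipt from the castle shop,receipt
A letter to the princess,unknown
"""


def classify_document(text: str) -> str:
    """Hypothetical system under test: a toy keyword classifier."""
    lowered = text.lower()
    for document_type in ("invoice", "receipt"):
        if document_type in lowered:
            return document_type
    return "unknown"


def run_table(table_text: str) -> list[str]:
    """Run the same check on every row and return a description of each failing row."""
    failures = []
    for row in csv.DictReader(io.StringIO(table_text)):
        actual = classify_document(row["input_text"])
        if actual != row["expected_type"]:
            failures.append(
                f"{row['input_text']!r}: expected {row['expected_type']}, got {actual}"
            )
    return failures


print(run_table(TEST_TABLE))
```

Adding a new case means adding a row to the table, not writing a new test.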