Inside The Rocketship
Why We Value Algorithms and Data Structures at UiPath
This post is not one of those blog entries where we list 20 examples from our codebase that make heavy use of algorithms and data structures. Well, we have those examples as well somewhere in the blog post, but that is not the point we are trying to make here. Read on, you will see what I mean.
What do ASP.NET, Entity Framework, the Android/iOS SDKs (Software Development Kits) and Node.js all have in common? They were all non-existent 20 years ago, and they will all likely not exist in 20 years' time, or at the very least they will become niche and obsolete technologies. That is because, as the world grows and evolves, so do its technological requirements. All those technologies that I just mentioned were not around two decades ago because there was no actual need for them to exist. And they will not be around in another two decades because there will not be a need for them to exist anymore by that time: the world will have different and quite possibly more complex needs come 2042.
This is not news: decades might as well be centuries in the world of software engineering, such is the pace of our industry. And yet, despite this tremendous rate of evolution, there are still some aspects of our trade that are perennial—unchanged, and just as relevant today as they were when they were first invented, with their relevance likely to continue in the foreseeable future.
I am talking, of course, about the very fundamentals of computer science: algorithms and data structures (A&DS for short). They are the building blocks that keep our modern world ticking along and are responsible for all the modern creature comforts that we take for granted. They may not be glamorous and in the spotlight, but without them our world would be a whole lot different. Examples, you say? Advanced compression algorithms that allow you to stream your favorite shows. The routing and logistics algorithms that allow for same-day deliveries of your groceries. Your car's self-parking capabilities. Secure online banking and online shopping. The power-saving routines that allow your appliances to consume less power. The fancy code that does protein folding and chemical simulations in the hopes of creating better medicine. Your microwave, deciding how much and for how long to heat your pizza.
Everything in the modern world relies on algorithms and data structures, and this reliance increases with time as technology becomes ever more complex and ever more ingrained in our society. What is interesting, however, is that no matter what "latest-and-greatest" technology you can think of, you will always find that it is underpinned by the same principles and concepts invented long ago. And, make no mistake, the next generation of "latest-and-greatest" technology will inherit the same fundamentals. It is thus almost a duty for those working in the software industry to have a good understanding of these fundamentals.
Now, make no mistake, you can get by as a computer programmer without knowing much in the way of A&DS. Given the modern frameworks and SDKs at our disposal and given the fact that a lot of software development these days is just moving data from one system to another according to some business rules, you can make a comfortable living as a software programmer without knowing much beyond what a "list" is and what the result of sorting it should look like. You can be the modern bricklayer.
Being a software engineer, however, is a different matter altogether. Engineering is not just building a house brick by brick; engineering is understanding why and how a house should be built in a certain way. What are the various trade-offs and implications of all your design decisions? What forces are acting upon each and every brick? What is the function of each brick? How will your house hold up in a couple of years' time? What can and cannot be done to extend its function and lifespan? You cannot answer these questions without understanding the fundamental forces and processes acting upon your creation. Sure, you can build a house quite easily with modern materials, but you will not really understand what keeps it upright. And when it comes to scaling up your construction, you might end up not with a skyscraper, but with the Leaning Tower of Pisa: both are relatively tall buildings, but one is a fraction of the size and complexity of the other, and I for one know at the top of which one I would feel safer.
The difference between the two buildings is not so much skill as it is knowledge. To build tall and safe, you need not only artisans to lay the bricks expertly, but also experienced engineers to determine where and how those bricks should be laid. Artisans alone will only get you so f̶a̶r̶ tall.
At the same time, all of this does not mean that you must inspect each and every brick and calculate each and every force acting upon your work. It just means that, as a good engineer, you have the capabilities and know-how to think about your work formally when the need arises. This, you will notice, applies to all areas of engineering, not just software engineering. But limiting ourselves to the software engineering world, what does thinking formally about your work even mean? What does it involve?
There is, of course, the architectural component. The design patterns, the best practices, the coding conventions, etc. Most developers are familiar with these. Knowing these makes you an OK software engineer, but not necessarily a good one. There is also a deeper layer hidden behind these well-known and familiar concepts. It is the layer that makes the good software engineers, well, good.
There is a long discussion to be had here, but here is the gist: you need to think of your code both in terms of "classes, objects and API calls" and in terms of "graphs, lists and formal operations". Each class is also a node in a graph of classes, each API call can also be formalized and thought of as a mathematical function, and the ebb and flow of what your application does is akin to the well-defined steps of an algorithm. In fact, your real-world application is just a practical application of theoretical algorithms and data structures. This means that, as a good software engineer, you can (and should!) apply the same logic and reasoning to practical software development that you apply to theoretical A&DS questions.
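To make the "each class is also a node in a graph" idea a bit more tangible, here is a minimal sketch in Python. It treats a handful of made-up classes and their dependencies as a directed graph and uses a plain depth-first search to look for a dependency cycle. The class names and the `find_cycle` helper are invented for illustration and are not taken from any real codebase.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we have closed a loop
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical classes: each key depends on the classes in its list.
class_deps = {
    "OrderService": ["OrderRepository", "Notifier"],
    "OrderRepository": ["Database"],
    "Notifier": ["OrderService"],
    "Database": [],
}

print(find_cycle(class_deps))  # ['OrderService', 'Notifier', 'OrderService']
```

Nothing here is specific to classes: the same few lines would work for packages, services, or build targets, which is exactly the point of looking at the problem as a graph.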
There is a beautiful and subtle duality here that, once you finally see, you cannot unsee. And being able to immerse yourself in this duality will turn you into a better software engineer. Does this involve memorizing by heart all the algorithms that you come across? Being able to write a working quicksort implementation on a whiteboard? Solving Olympiad-level problems while at the pub with your friends? No, no and maybe (kidding about the maybe). It is a lot simpler than that. To become a good software engineer, you just need to:
• have a solid entry-level theoretical background, i.e., solid enough to thoroughly understand the fundamentals of computer science;
• know how to apply those fundamentals to real-world scenarios;
• have the thought process and problem-solving skills required to do this effectively.
This does not mean that you must be a genius or a hermit living in a cave surrounded by ancient texts, performing arcane rituals. The barrier to entry is really not high at all. You will see later that there are really only a couple of fundamental concepts that you need to understand; the catch is that you really need to understand them. And truly, blood sacrifice is optional.
So why go through all the effort, you ask? Good question, and this next section will try to give you some of the reasons why.
It is a common pitfall to think that A&DS are theoretical only and have no impact on real-world software engineering. Nothing could be further from the truth! To exemplify, here are just a few unexpected real-world benefits of thinking in terms of formal algorithms and data structures. These apply regardless of your tech stack and position on the engineering food chain:
Much better architecture for your projects. Classes and objects are just nodes in a graph and applying graph-related thinking to your class hierarchy can yield surprising results. The same goes for your actual code; you would be surprised how much you can simplify your code when you apply formal thinking to it—and how natural it becomes after a certain point.
Way better code reviews. Scrutinizing code with a formal eye tends to bring out hidden flaws or weak spots that you would otherwise tend to gloss over. Just because it "looks right" does not mean it is right, and applying some mathematical rigor every now and then does wonders for your codebase.
Fewer bugs, more robust code. Not only do you have better architecture and better code reviews when you apply the A&DS mindset to the real world, the code you write is more robust to begin with. Because you reason about it and try to find counterexamples and flaws, and because there is a spider sense in your brain that says, "something is not right here, keep poking". And should you decide to cut corners (let us be honest, we all have to at some point), you will have a much better understanding of the implications.
A much easier time prototyping. This may sound paradoxical, but applying the "A&DS formal mindset" to your prototypes is incredibly helpful. Not only do you tend to spot dead-ends early in the game, but you can also analyze your "what if" scenarios before you even write a single line of code.
Easier, quicker, and more productive conversations with your peers. When you all speak the same fundamental language and can view code abstractly, discussions and problem solving become a lot faster and a lot more fun. You can debate your code for one hour, or you could just say something like "this is similar to this-and-that concept/problem/algorithm". Not only did you just save everyone a lot of time, but you have also gained the benefit of all the formal work and reasoning already done on whatever "this-and-that" happens to be in your case.
Now, all of these are real examples, but it does not mean that you will begin to talk to your colleagues in mathematical notation or use Boolean algebra to analyze your code. Rather, all these positive behaviors and their outcomes will manifest themselves subtly and gradually. You will most likely not even be aware of them, but I can guarantee they will have a noticeable impact on your work. Furthermore, just like brushing your teeth, they will become second nature to you—even though you will sometimes forget or consciously choose not to do them.
Simply put, knowing about algorithms and data structures changes the way you think about software for the better, and thus turns you into a better software engineer. It is that simple.
"But A&DS is a hard, niche topic that takes years of practice to understand, and I have a job and a family, and I don't really have the time," I hear you say. Well, no, you are wrong.
Let us look at some common misconceptions around A&DS—those that usually tend to scare people off at the mere mention of the subject. And relax, there are not 101 of them:
It takes a lot of time and effort to achieve results. I will tackle this one with a metaphor. Ever see a baby learn to walk? At first, his frustration is high, and his progress is rather slow. But he keeps at it, and soon discovers that being able to r̶u̶n̶ w̶a̶l̶k̶ stumble from point A to point B on his own is somehow fun and useful. Fast forward a bit, and he stumbles less and less, until suddenly he is walking like there is no tomorrow. Eventually he will be capable of a mild jog or even a short run, which is a skill that comes in handy (especially if you are a kid). And these are all skills that will stay with the tiny human for the rest of his life. Of course, to go from an immobile infant to the next Usain Bolt takes a huge amount of time and effort, and it is not for everyone. But to become someone that can walk comfortably and occasionally run—that is surprisingly natural once you get over the few initial bumps and scrapes. Just keep at it.
You need to know a lot of algorithms and data structures by heart. Nope. The number and complexity of the algorithms that you know by heart matters very little, if at all. Sure, there are a few basic things that you should know by heart, just as you should know your multiplication table by heart. But otherwise, there is little need to memorize stuff, and the stuff that you will memorize will come naturally, with no dedicated effort on your part. The rest you can just google.
If you cannot provide an answer quickly, you do not know your A&DS. Nope again. Being a good software engineer from an A&DS perspective is not related to how fast you can provide answers and solutions. Sure, being fast helps, but it is much more important to be right. Taking the extra time to double-check your work, analyze your methods for theoretical correctness, look for corner cases and potential problem areas, play devil's advocate, and try to optimize your solution: all these are the hallmarks of a great software engineer.
A&DS are purely theoretical, they have nothing to do with real code. This could not be further from the truth, as mentioned previously. Formalization is extremely important, and being able to reason about your work not in terms of framework X or Y but in terms of abstract concepts and operations is the most crucial tool in the software engineer's arsenal. It is the difference between having a satisfactory solution in a particular scenario (that may not be applicable in the real world or in one year's time) and having a satisfactory solution, period: one that scales, is robust, and has had some mathematical rigor applied to it.
I told you we have this section! Here it is! At UiPath, we really value A&DS knowledge, and we especially value it (and you!) if it can be applied to the real world. This happens a lot more often than you think. We do not sit around all day thinking in terms of abstract concepts and trying to find the best way to reverse that linked list, but at the same time we frequently hit problems that require careful thinking and mathematical solutions—anything less just will not do. Some examples:
• For authentication purposes, our activities (think of them as LEGO blocks that join to automate a process) may require different permissions, and some activities can do their job with more than one set of permissions; that is, any one of the N sets of permissions will do. We need to figure out at the scope level what the minimum set of individual permissions is that satisfies all child activities.
• Each activity package has its own set of dependencies (classic NuGet behavior). Managing these dependencies for projects where many activity packages are involved is a real blast.
• Did we mention we have a dedicated machine learning department?
• Our UI Automation framework needs to be able to identify controls quickly, reliably, and uniquely on screen. Even if these controls change position, size, or sometimes even content, the code still needs to work.
• When we refactor our APIs, we need to really think about our changes, as we need to ensure we are both backwards compatible with existing deployments (or at least have the minimum number of breaking changes) and forward-looking/extensible/able to accommodate new behaviors and requirements. This does not involve just the signature of the APIs; it also involves the actual operation of those APIs and the formal definitions of what each API should do. We need to carefully analyze and think about each change that we make and try to figure out their implications before they hit production.
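To give a feel for how the first example on this list can be formalized, here is a small Python sketch. It is only an illustration under assumed inputs, not how our product actually solves it: each activity lists the alternative permission sets it can work with, and we try every way of picking one alternative per activity, keeping the smallest combined set. The activity names, the permission names, and the `minimal_permissions` helper are all made up. The brute-force search is exponential in the number of activities, which is fine for a toy example; the general problem contains the classic "hitting set" problem as a special case, so anything bigger calls for smarter techniques.

```python
from itertools import product
from typing import Dict, FrozenSet, List


def minimal_permissions(activities: Dict[str, List[FrozenSet[str]]]) -> FrozenSet[str]:
    """Pick one acceptable permission set per activity so that the union is smallest."""
    for name, alternatives in activities.items():
        if not alternatives:
            raise ValueError(f"activity {name!r} has no acceptable permission set")

    best = None
    # Try every way of choosing one alternative per activity (brute force).
    for choice in product(*activities.values()):
        combined = frozenset().union(*choice)
        if best is None or len(combined) < len(best):
            best = combined
    return best


# Hypothetical scope with three child activities and invented permission names.
scope_activities = {
    "ReadMail": [frozenset({"mail.read"}), frozenset({"mail.readwrite"})],
    "SendMail": [frozenset({"mail.send"})],
    "MoveMail": [frozenset({"mail.readwrite"})],
}

required = minimal_permissions(scope_activities)
print(sorted(required))  # ['mail.readwrite', 'mail.send']
```

Notice that "ReadMail" ends up using its broader alternative, because "MoveMail" needs that permission anyway. That kind of interaction between activities is what makes the problem more interesting than it first looks.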
Whenever something like the examples on this list comes our way, our collective consciousness tends to instantly switch to the "formalize, draw parallels and analyze" way of thinking. Our algorithmic brains really shine. Those are some interesting discussions, involving both the "down to earth, code in front of your eyes" perspective and the "generalized, abstract and formal" representation of the problem that we want to solve. But the benefits of having a rich A&DS engineering culture at UiPath do not stop here, and indeed transcend the codebase (wow, this sounds fancy!).
The A&DS mindset also helps convey information to others. It helps you put yourself in the other party's shoes, especially when the other party may not be technical, or may be technical but not familiar with the problem space you are discussing. You build, analyze, and optimize your communication in much the same way you would build, analyze, and optimize a solution to an A&DS problem. This comes in extremely handy at UiPath, given our breakneck pace of growth and the myriad of changes and communications that need to happen to sustain that pace. It (sometimes) makes for interesting beer conversations.
Lastly, having a good grasp of A&DS also helps us switch teams and problem spaces more easily. We may not know the technologies the new team works with, or the in-depth problem space they are tackling, but as long as we can figure out the general concepts involved and draw parallels to a couple of formal scenarios that we do know from our A&DS experience, we will get up to speed fairly quickly. Because of these reasons, and many more, we really value A&DS knowledge at UiPath. It's really a core part of our engineering culture, and it's something I've personally enjoyed experiencing.
It should be obvious by now that we put a lot of weight on the A&DS interview. It should also hopefully be clear why. In our interview process, we give candidates a list of topics that we feel are indicative of a solid theoretical background in A&DS. We do this ahead of the actual A&DS interview. The list is not exceedingly long: mostly lists, trees, searching, sorting and traversal, plus a couple of common operations and tasks that involve these fundamental concepts. We also do not expect you to know these subjects extremely in-depth (you get bonus points if you do, though). For example, it helps if you know one or two sorting algorithms, but you do not really need to know them all. And it is OK if you cannot produce a quicksort implementation off the top of your head: as long as you can explain how the algorithm works and are able to draft a suitable solution, we will sort out the details together during the interview. There are a couple more topics that you can score "bonus points" on, but fundamentally that is all we really expect you to understand.
The catch is that we expect you to thoroughly understand these topics, not just recite them. That is what we screen for and value: true understanding. We will happily give you all the hints and the missing theoretical background that you ask for, but we do expect you to demonstrate a clear understanding of the topic at hand.
As a continuation of this, we try to test your problem-solving skills and your ability to draw parallels and apply concepts and knowledge that you may already have to a problem that you have not seen before. So, we might fail a candidate that knows the theory behind a topic but cannot apply it, but we will happily extend an offer to a candidate that does not know the theory, asks for some hints, then deduces the theory ad-hoc and solves the problem.
Last but not least, since knowledge that is stuck in your head does not really benefit the team, we are looking for solid teamwork and collaboration skills. With you as a potential new addition to our team, we are genuinely curious about what your voice can bring to the conversation and how you think and communicate about (somewhat) abstract concepts that may be difficult to convey.
Hopefully, this article has helped you get a sneak peek into our engineering culture and see algorithms and data structures in a different light. If you have seen them in this light, then please contact us—we have a job opening waiting for you on our careers website.