The Building Blocks Of A UI Test Automation Framework

Learn about what a UI Automation framework is and how it's built

by Mark Winteringham
Sep 23, 2021

When we are new to user interface (UI) test automation, we generally think of an automation framework as a large unit of software. But dig a little deeper, and you will find that, like most software, a UI test automation framework is a collection of libraries that work together.

This is an important distinction to keep in mind, especially for those who are new to the subject. It allows us to break down the different actions that occur in a framework and learn how each piece works on its own and how the pieces contribute to a framework as a whole.

Whether you’re using an ‘off the shelf’ solution such as Cypress or building your own framework, the same ‘core’ components are involved. A deeper appreciation of the function of each component will enable us to get the most out of each piece and create more robust and reliable automated checks.

So let’s take a look at four core ‘building blocks’ that make up a UI test automation framework:

The package manager
The runner
The UI driver
The assertion library

We’ll dive a bit deeper into each of these parts to understand better what they are and how they work — starting with the package manager.

Managing Multiple Libraries: The Package Manager

Each ‘building block’ of a framework contains different features that work together to create our framework. To access these features we require specific libraries for each ‘block’. That means we need a way of loading each of these libraries into the framework. This is usually handled by a package manager.

In a package manager’s configuration, we declare the libraries we want to use as well as their version numbers. Pinning the version matters because different versions of a library can introduce changes, bugs, or features that we don’t want in our framework.

For example, with Maven, a package manager for Java, you can add ‘dependencies’ to a pom.xml file to determine which libraries you want and the version of each library to use:

<dependencies>
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>3.141.59</version>
    </dependency>
</dependencies>

Not only do our package managers give us access to all the libraries we need, but we can also easily reference plugins, scripts, shortcuts, and much more. Plugins can solve many common problems, such as determining the version of a language to use, finding our automated checks, sending reporting metrics, and so on.

Package managers that we could use include:

Maven (Java)
RubyGems (Ruby)
pip (Python)
npm (NodeJS)
NuGet (C#)

Running Your Automated Checks: The Runner

Our package manager makes available to our code the functionality of libraries and plugins, but it is usually not responsible for the organisation and running of automated checks. For this, we require a runner.

A runner reads files that we create in order to ‘know’ which checks to run. For example:

First, we create a file with the word Test in the filename (the keyword Test helps our runner to identify which files to run).

Next, inside the file, we add a function or method marked in a way that tells the runner it is an automated check to run. For example, in Java, we can add the following, in which the @Test annotation marks the method exampleCheck() as a check:

@Test
public void exampleCheck() {
    // automation code goes here
}

Finally, we add some code inside the method, which is where our automation code lives (which we’ll learn about in the following section on UI drivers).

By following these simple steps, we can create the ‘shell’ of each automated check, inside which our code will be triggered. This enables us to organise our checks in a clean and consistent manner.
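To make the idea of a runner ‘finding’ checks more concrete, here is a minimal sketch of a toy runner written with plain Java reflection. It is not how JUnit or any other real runner is implemented; the @Check annotation and the ExampleChecks and MiniRunner classes are hypothetical names invented for this illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical marker annotation, standing in for a real runner's @Test.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Check {}

// Hypothetical class with two checks and one helper the runner should skip.
class ExampleChecks {
    @Check
    public void passingCheck() {
        // Automation code would go here.
    }

    @Check
    public void failingCheck() {
        throw new AssertionError("expected button to be displayed");
    }

    public void helperMethod() {
        // Not annotated, so the toy runner ignores it.
    }
}

class MiniRunner {
    // Invokes every @Check method and records "PASS" or "FAIL" by method name.
    static Map<String, String> run(Class<?> checkClass) {
        Map<String, String> results = new TreeMap<>();
        try {
            Constructor<?> constructor = checkClass.getDeclaredConstructor();
            constructor.setAccessible(true);
            Object instance = constructor.newInstance();
            for (Method method : checkClass.getDeclaredMethods()) {
                if (!method.isAnnotationPresent(Check.class)) {
                    continue;
                }
                method.setAccessible(true);
                try {
                    method.invoke(instance);
                    results.put(method.getName(), "PASS");
                } catch (InvocationTargetException e) {
                    // The check itself threw, so it counts as a failure.
                    results.put(method.getName(), "FAIL");
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not run checks in " + checkClass.getName(), e);
        }
        return results;
    }

    public static void main(String[] args) {
        run(ExampleChecks.class).forEach((name, result) -> System.out.println(name + ": " + result));
    }
}
```

Running main reports failingCheck as FAIL and passingCheck as PASS, while helperMethod never appears because it carries no annotation. Real runners add much more on top of this, such as setup and teardown hooks and reporting.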

Test runners that we could use include:

JUnit (Java)
RSpec (Ruby)
unittest (Python)
Jest / Mocha (NodeJS)
NUnit (C#)

Interacting With A UI In Real Time: The UI Driver

Our runners enable us to create the ‘shell’ of an automated check, meaning whatever we put into the ‘shell’ will be executed. The runner doesn’t care if our code automates a browser, calls an API, or is simply a unit of code.

So if we want to be able to drive the UI of a browser, we require a component that is responsible for opening a browser and allowing us to interact with its elements. This is the role of the UI driver.

A UI driver allows us to send instructions programmatically to a browser so that we can do things like click links, fill in forms, or detect whether elements exist on a page, to name just a few examples. Each driver works in its own way, but the most common tool for driving the UI is Selenium-WebDriver, whose behaviour can be summarised in this diagram:

[Figure: Java code connects to a WebDriver, which connects to a browser, describing the relationship between code and browser.]

We can use a tool like Selenium-WebDriver to declare which element we want to locate using findElement() and what we would like to do with that element using click(). When this code is triggered, it is sent to an instance of Driver, which translates our code into an instruction for a specific browser. The loop is then closed by the browser, feeding back through the driver that the action is complete.

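The code below demonstrates how findElement() looks for an element that matches the CSS selector #createRoom and asks if the element is displayed on the screen. Note that findElement() throws a NoSuchElementException when no element matches, so the boolean only tells us whether an element that was found is visible: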

boolean buttonExists = driver.findElement(By.cssSelector("#createRoom")).isDisplayed();

This is a powerful approach to driving the browser UI: rather than change our code for each browser type, which might be necessary otherwise, all we need to do is switch the Driver instance depending on the browser we want to use. The Driver instance will translate our code into the correct actions.
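The idea of swapping the Driver instance while leaving the check code untouched can be sketched with a small interface. This is a toy model only: ToyDriver, ToyChromeDriver, ToyFirefoxDriver, and CreateRoomFlow are hypothetical names, and the instruction strings are made up for illustration rather than taken from what Selenium-WebDriver actually sends to a browser.

```java
// Hypothetical stand-in for a UI driver; this is not Selenium's real interface.
interface ToyDriver {
    // Translates a generic "click this element" request into a browser-specific instruction.
    String click(String cssSelector);
}

// Made-up translation for one browser.
class ToyChromeDriver implements ToyDriver {
    @Override
    public String click(String cssSelector) {
        return "chrome: click " + cssSelector;
    }
}

// Made-up translation for another browser.
class ToyFirefoxDriver implements ToyDriver {
    @Override
    public String click(String cssSelector) {
        return "firefox: click " + cssSelector;
    }
}

class CreateRoomFlow {
    // The check code is written once against the interface, never against a specific browser.
    static String clickCreateRoom(ToyDriver driver) {
        return driver.click("#createRoom");
    }

    public static void main(String[] args) {
        // Only the driver instance changes between browsers.
        System.out.println(clickCreateRoom(new ToyChromeDriver()));
        System.out.println(clickCreateRoom(new ToyFirefoxDriver()));
    }
}
```

The design point is that clickCreateRoom() never mentions a browser, so supporting another one means adding a driver implementation rather than editing every check.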

UI drivers that we could use include:

Selenium-WebDriver (across different languages)
Watir (Ruby)
Native JavaScript (NodeJS)

Did The Code Work As Expected? The Assertion Library

We’ve discussed package managers, runners, and UI drivers. With these components, we have the parts required to build a framework that interacts with our UI.

However, to complete our framework, we need some means to determine if the flow of our automation has been successful, resulting in a pass or fail that tells us if something has changed in a system. This is why we require a library to help us ‘assert’ success.

At some point near the end of our automated check, we’ll likely have extracted some data that allows us to determine if our expectations are confirmed or something has changed. Traditionally this is done with an assertion library that allows us to compare that expectation with the captured data. For example, in Java, we can do:

assertEquals(true, buttonExists);

This checks that the value of the variable buttonExists, which would have been extracted using our UI driver, equals true. If it is true, the assertion will pass. If it isn’t, the assertion will fail.
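Under the hood, an assertion is little more than a comparison that throws when the values differ, which is what lets a runner mark a check as failed. The sketch below shows that idea in plain Java. MiniAssert is a hypothetical class written for this illustration, not the source of any real assertion library, and the buttonExists value is hard-coded where a real check would get it from the UI driver.

```java
import java.util.Objects;

// Hypothetical, minimal assertion helper for illustration only.
class MiniAssert {
    // Returns silently when the values match; otherwise throws so a runner can record a failure.
    static void assertEquals(Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Assumed value; a real check would extract this with the UI driver.
        boolean buttonExists = true;
        assertEquals(true, buttonExists);
        System.out.println("assertion passed");
    }
}
```

The failure message names both values, which is the main thing a dedicated assertion library adds over a bare if statement: when a check fails, we can see what was expected and what was found.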

Many runners come with an assertion library built in. But some standalone assertion libraries we could use include:

Hamcrest (Java)
Chai (NodeJS)

Putting It All Together: Sample Code

This has been a quick exploration of some of the core components of a UI test automation framework. If you’d like to learn more about how these components might be arranged to make a framework, take a look at my Intro to UI framework on GitHub: https://github.com/mwinteringham/intro-to-ui/