Making Browser Automation Easy With Python and WebDriver

How To Use Python To Automate Almost Anything: Article 2 of 3

by Josh Grant
Oct 16, 2017
13 min read


In part I, I introduced Python as a good choice of programming language for testers who want to automate almost anything. I walked through the initial setup of Python, some possible libraries, the functionality that Python provides, and what a “Hello, World!” program looks like. Now let’s look at how Python would work for automating a “real world” testing task.

Before you begin, here are a few items you’ll need to have installed for this article:

Python - make sure Python is installed as described in the previous article for your OS (Mac, Windows or Linux)
Chrome - we’ll be using the Chrome browser for this article. If you haven’t installed Chrome on your computer, make sure you download it now.
ChromeDriver - We will need this browser driver to control Chrome from external scripts. It’s straightforward to download ChromeDriver and easy to install if you follow Google’s installation instructions.

The rest of the article assumes that Python, Chrome and ChromeDriver are correctly installed and configured on the machine used to run the script below.

Automated Capturing of Screenshots

Suppose you start working on a web-based project that requires inspecting several screenshots of an app to look for visual changes. Even simple web applications can have dozens of pages, and each one may have to be visited individually. This task may be conceptually simple but can be quite time-consuming. This is one place where automation can greatly help out.

One approach to solving this problem could be to use a script that iterates through a list of URLs, visits each one and takes a screenshot. If such a list can be produced, then this task can be automated in a relatively straightforward way. We can harness the power of the Selenium WebDriver to make automating browser tasks easy. We can use the WebDriver in a Python script to automate using a browser, which in turn allows us to automate taking screenshots.

There are other approaches we could take to acquire the screenshots. For example, we could open several browsers at once, each one at a given URL, and then inspect each one manually. The downside of this approach is that the inspection doesn’t produce any files or artifacts that can be reused later. Taking screenshots in an automated way does: files containing screenshots can be shared by team members or saved for future use. Automating the capture also removes some manual steps that could be error-prone, such as opening or closing browsers prematurely. For these reasons, a script that automates visiting URLs and taking screenshots seems like a better approach to our problem.

Assumptions Of The Script

In most cases when solving a problem, you have to start somewhere. Overthinking and under-doing doesn’t work. When it comes to scripting, it is also easy to get carried away thinking about all kinds of functionality and capabilities that can be put into a script.

It’s often helpful to limit the scope of a script in order to get something working as a starting point, so we’ll take this approach here. We can make assumptions about the URLs we need to visit and the tasks to get a starting point nailed down.

Let’s list some of our assumptions:

Each URL in the list is correct and valid (i.e. points to a valid site).
The list is formatted such that there are no blank entries or lines.
This list is found in a single file in the same place as the script.
It’s sufficient to view each site on one browser (no cross-browser testing required). In our case, we will use the Chrome browser.
Screenshots can be saved in the same directory as the script.
No authentication is required to view the pages.

These assumptions are fine for an example, but they might not hold in practice. Entries in a text-based list could be invalid or blank depending on how the list is created. There may also be some requirement to test against different browsers at some point. However, the above assumptions give us a good starting point for putting a script together.
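If the assumptions about blank or invalid entries turn out not to hold, the list could be filtered before it is used. The following is a minimal sketch and not part of the script built in this article. The helper name clean_urls, and the rule that a usable entry must have an http or https scheme and a host name, are assumptions made for illustration.

```python
from urllib.parse import urlparse


def clean_urls(lines):
    """Return only the entries that look like absolute http(s) URLs."""
    cleaned = []
    for line in lines:
        candidate = line.strip()  # drop the newline and surrounding spaces
        if not candidate:
            continue  # skip blank lines
        parts = urlparse(candidate)
        # Assumption: a usable entry has an http/https scheme and a host name.
        if parts.scheme in ("http", "https") and parts.netloc:
            cleaned.append(candidate)
    return cleaned


# Made-up entries for illustration.
raw_lines = ["https://example.com\n", "\n", "not a url\n", "http://example.org/page\n"]
print(clean_urls(raw_lines))
```

A check like this only looks at the shape of each entry. It does not confirm that the site exists or responds.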

Writing The Script

Here is a general description of how we’re going to capture screenshots using Python:

Steps

Create a directory in the project called shots.

Get the entries of the list of URLs from a file.

Visit each URL in the list and take a screenshot.

Save the screenshot as a PNG file in shots.

These steps rely on our assumption that the list contains only valid URLs, so visiting each one will produce a valid screenshot. For this first draft, we also assume that there won’t be any errors in taking each screenshot.

A Place To Start

Start by opening your favourite text editor. Create a file and name it take_screenshots.py. This will be the script we use to capture our screenshots. If you don’t have a favourite text editor already, try Sublime Text, GNU Emacs, Notepad++ or my personal favourite, Visual Studio Code.

Creating Your Test Directory

Let’s start with the first step above, creating a new directory called shots. This is the directory or folder that will contain the screenshot images. To work with directories in Python, we first add this line at the beginning of the take_screenshots.py file:

import os

Importing a module is a typical way, in Python, to make use of some library or set of functions. Adding this import line allows us to make use of the os module, which is a Python library for working with operating system-level objects, such as files and directories. Since os is a standard Python module included with all major releases of Python, it can be imported using the first line in the code above. Python and the Python community have created many modules that contain functionality for all kinds of tasks. Libraries contain existing functions which allow you to write scripts quickly.

Creating The Directory

Now that we have the os module to help us out, we can use some of the functions in it for our task. We want to create a directory called shots/ in the project directory, using the lines:

if not os.path.exists('shots'):
    os.makedirs('shots')

This will create this directory, if it doesn’t already exist.

Checking Conditions

The line starting with if is called a conditional. It checks whether a particular condition is met and, if so, executes the code that belongs to it. In this case, if the shots/ directory doesn’t exist, we create it. If it does exist, we don’t do anything at this step. The function os.path.exists() checks whether a path exists, and os.makedirs() creates the directory (along with any missing parent directories). Both functions take a string as the path to a directory. The code that belongs to the if-statement must be indented beneath it.

Note that this indentation is part of Python’s syntax, not just a matter of style. Four spaces per level is the usual convention.
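On Python 3.2 and later, the existence check and the creation can be combined by passing exist_ok=True to os.makedirs(). The sketch below shows this variant. It uses a temporary directory rather than the project's shots/ directory, which is an assumption made so the example can run anywhere without leaving files behind.

```python
import os
import tempfile

# Work inside a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as base:
    shots_dir = os.path.join(base, "shots")

    # exist_ok=True tells makedirs not to raise an error
    # if the directory is already there.
    os.makedirs(shots_dir, exist_ok=True)
    os.makedirs(shots_dir, exist_ok=True)  # the second call is harmless

    created = os.path.isdir(shots_dir)

print(created)
```

Either form works for this script. The explicit if-statement is kept in the rest of the article because it makes the condition visible.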

Create A Source File Filename List

At this point, we have created a shots/ directory. Now we can move on to the next step and create a list of URLs to iterate through. Since we’ve assumed these URLs are kept in a single file, like a file named url_list.txt, we can open the file and collect each URL in the file and place it into a Python list. The code to do this looks like this:

urls = []
with open('url_list.txt') as f:
    urls = f.readlines()

Let’s go through this carefully.

Creating An Empty List

To create an empty list called urls, we add the first line in the example above to our take_screenshots.py file. An empty list in Python is denoted by two square brackets: []. In our example, urls is initialized as an empty list. This is a typical way in Python of defining a variable so that it suggests how it will be used. Lists in Python play the role that arrays play in many other languages, so a Python list can be thought of as an array that can grow and shrink.

Reading File Lists In Python

Next we want to get the URL list from the file. Since our file is well-formatted, we can open the file and read each line of the file into the urls list. Python offers a fairly clean way of doing this. The function open() is a built-in Python function that opens a file. This file can be used for reading, writing or appending. In our case we only want to read data from the file, and since open() opens a file for reading by default, we don’t need to specify a mode. We give this function the file name, and it opens the file.

Files that are opened eventually need to be closed. You can open and close files directly in Python, but it is easy to forget to close an open file, and files left open can lead to odd behaviour later on. Python helps avoid these problems with the with construct. By opening the file using the above syntax, the file is automatically closed after all the lines inside the with block have been executed. In our case, the url_list.txt file is opened and the file object is assigned to the variable f. Using f, we can call f.readlines() to read every line into a list in one call. Each line becomes one entry in the list. Once the with block ends, the file is automatically closed.
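To see the with construct in action, the sketch below writes a small stand-in for url_list.txt into a temporary directory and reads it back. The file contents and the temporary location are assumptions for illustration only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    list_path = os.path.join(tmp_dir, "url_list.txt")

    # Create a small stand-in for the URL list (contents are made up).
    with open(list_path, "w") as f:
        f.write("https://example.com\nhttps://example.org\n")

    # Read every line into a list; the file is closed when the block ends.
    with open(list_path) as f:
        urls = f.readlines()

    file_closed = f.closed  # True once the with block has finished

print(urls)         # each entry still ends with a newline character
print(file_closed)
```

Note that each entry keeps its trailing newline, which matters later when the URLs are passed to the browser.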

We have now read our values from the URL list file into a list we can use in Python. Let’s see what we can do with the browser.

Using Selenium WebDriver And Creating The URL Script

The next step is to visit each URL in the list in some straightforward way. Now that we can access each URL, we need a way to open the browser and navigate to each one so that a screenshot can be taken. For this we will use the renowned Selenium WebDriver.

Note: Double check that you have Python, Chrome and ChromeDriver installed as mentioned at the beginning of this article. Review installation steps for these as needed based on your operating system.

The WebDriver is a library that hooks into browsers and drives the browser like an end user would. You can perform actions such as visiting a URL, sending key presses and mouse clicks, or refreshing the page. The WebDriver covers most actions that a human user may want to undertake. This makes it a good choice for our task of visiting a list of URLs and taking screenshots.

Using PIP

To start using the WebDriver in Python, we can install the Selenium module using pip, Python’s package manager. Pip comes with standard installations of Python and should be available on your system if Python is installed. To install the Selenium module in Python, open a terminal and type:

pip install -U selenium

Importing WebDriver

This command will download and install everything required to use the Selenium WebDriver with Python. The -U option upgrades Selenium to the latest version if an older version of the package is already installed. Once this command has installed Selenium, we can use Selenium utilities in our script by writing our imports like this:

import os
from selenium import webdriver

In the above import statements, import os makes everything in the os module available under the name os. We can also import particular utilities from modules. Since the selenium package contains several modules and functions that we don’t need for our script, we import only the webdriver module by using the from selenium import webdriver statement.

Now we can access the functionality provided by the WebDriver.

ChromeDriver Sessions

First, the browser should be opened up and a new session created. This session will be what is used to drive the browser so we can visit each URL. To create a new session, we need to add the following line to our script:

driver = webdriver.Chrome('/path/to/chromedriver/')

This line opens a new instance of Chrome to a blank page. This is a real instance of Chrome that works identically to an instance of Chrome invoked by the user. We can now control this browser using Python like any other tool or library using the driver object.
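Newer Selenium releases (version 4 onwards) changed how the driver path is supplied: it is wrapped in a Service object rather than passed as the first argument. The sketch below shows that form under stated assumptions. The helper name start_chrome is hypothetical, the path is the same placeholder used above, and the imports sit inside the function only so the snippet can be loaded on a machine without Selenium installed.

```python
def start_chrome(chromedriver_path=None):
    """Start Chrome using the Selenium 4 style of locating ChromeDriver."""
    # Imported here so that defining the helper does not require Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    if chromedriver_path is None:
        # Assumption: Selenium 4.6 or later is installed, whose bundled
        # Selenium Manager tries to locate a matching driver by itself.
        return webdriver.Chrome()

    # Otherwise point Selenium at an explicit ChromeDriver executable.
    return webdriver.Chrome(service=Service(executable_path=chromedriver_path))


# Usage (commented out because it opens a real browser window):
# driver = start_chrome('/path/to/chromedriver/')
```

Which form applies depends on the Selenium version that pip installed, so it is worth checking with pip show selenium if the one-argument call raises an error.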

Constructing A For Loop

Since we have a list of URLs, we can use a for-loop construct to iterate through each entry in the list and perform an action. This is a pretty common looping construct in Python since lists are easy to create and manipulate. In Python, we can create a for-loop to get each item in order like this:

for url in urls:

This creates a block that gets each item in the urls list and assigns each one to a local variable called url. Note the colon at the end of this line, which is required Python syntax for loops.
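As a small sketch of how the loop behaves, the example below iterates over a short list and records each entry instead of opening a browser. The two URLs are made up for illustration.

```python
urls = ["https://example.com", "https://example.org"]  # made-up entries

visited = []
for url in urls:
    # The indented lines form the body of the loop and run once per entry.
    visited.append(url)
    print("Would visit:", url)

print(len(visited))
```

The body runs once for each entry, in the order the entries appear in the list.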

Using each entry in the urls list, we can then do the following:

for url in urls:
    url = url.strip()
    driver.get(url)
    driver.save_screenshot('shots/screenshot.png')

Let’s walk through what each line of code is doing above.

For Loop Walk-Through

As previously mentioned, the first line sets up the for-loop for iterating through the list. This gives us the url variable to use within the loop. On each pass it contains a string representing the next URL in the list.

Using The Strip Method

Since the list came from a file, each url string ends with a newline character that is invisible to users but marks where lines break in the file. It’s common to have to strip this newline character from the end of a string in many programming contexts. Python strings provide a built-in method for this called strip(), which removes whitespace, including newlines, from both ends of a string. After this call, url contains exactly the URL we want to visit and no invisible characters that may cause problems when navigating the browser.
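The effect of strip() is easier to see with repr(), which shows the otherwise invisible newline. A short sketch, using a made-up URL:

```python
line = "https://example.com\n"  # a line as returned by readlines()

print(repr(line))          # repr() makes the trailing newline visible
print(repr(line.strip()))  # strip() removes whitespace from both ends

# rstrip("\n") is a narrower alternative that only removes trailing newlines.
trimmed = "  https://example.com\n".rstrip("\n")
print(repr(trimmed))
```

For a list of URLs, strip() is usually the more convenient choice because stray spaces at either end are removed as well.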

Navigating To The URL Via The Browser

Next we navigate our browser to the URL. Since this is a fairly core browser function, the WebDriver provides an easy way to do this with the get() function. Passing the url string into the get() function looks like this:

driver.get(url)

This line tells the WebDriver to go to that URL exactly the same way a human user would. Based on our previous assumption that each URL in the list is valid, the WebDriver will then visit this page and receive the data from that page in the expected way. It’s a straightforward step but it’s also key to the utility of writing our automated script.

Taking The Screenshot

The WebDriver also provides the ability to take a screenshot of a web page at a given point in time. Calling the save_screenshot() function does exactly this:

driver.save_screenshot('shots/screenshot.png')

The current contents of the visited page are captured in a screenshot and saved to a file, which in this case is called screenshot.png in the shots/ directory. The WebDriver captures screenshots as PNG images, so we give the file a .png extension.
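In Selenium's Python bindings, save_screenshot() returns True when the file was written and False when it could not be saved, so a script can check the result. The sketch below illustrates that check. FakeDriver is a hypothetical stand-in so the snippet runs without a browser, and capture is a helper name invented here; a real script would pass in the driver created earlier.

```python
class FakeDriver:
    """Hypothetical stand-in for a WebDriver session, for illustration only."""

    def __init__(self, can_write=True):
        self.can_write = can_write

    def save_screenshot(self, filename):
        # Mimics the real return value: True on success, False on failure.
        return self.can_write


def capture(driver, filename):
    """Save a screenshot and report whether the driver said it succeeded."""
    saved = driver.save_screenshot(filename)
    if not saved:
        print("Could not save", filename)
    return saved


print(capture(FakeDriver(), "shots/screenshot.png"))
print(capture(FakeDriver(can_write=False), "shots/screenshot.png"))
```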

Iterating The Screenshot Name

We’ve now written the main body of this script, which runs through the list of URLs and takes a screenshot of each one. However, there’s one small problem: because each screenshot is being saved to the same file location of shots/screenshot.png, each URL screenshot overwrites the previous one and we only end up with a screenshot for the final URL in the list. Whoops!

This issue can be remedied in a few different ways, but one technique for solving problems is to ask: “What’s the simplest possible thing that could work?” In this case, numbering each screenshot file from 1 up to the number of URLs in the list is an easy solution. Let’s create a counter to keep track of how many URLs have been visited, which can be used in the file name.

Creating a numeric variable in Python is simple. Before the loop, add this line:

count = 1

This creates a variable called 'count' that we can use in the rest of the script. Using this as a counter that gets incremented with each URL could look like this:

count = 1
for url in urls:
    url = url.strip()
    driver.get(url)
    driver.save_screenshot('shots/screenshot{}.png'.format(count))
    count = count + 1

Let’s look at the last two lines of the above code.

Script Improvements

Instead of saving each screenshot to the same file, we’re now creating a different screenshot filename for each URL that includes the counter:

'shots/screenshot{}.png'.format(count)

The curly braces {} in the string are a placeholder that tells Python where to insert a value. So if count is 1, then the resulting filename becomes shots/screenshot1.png. The format() method does the work of inserting the value of count into our screenshot name.
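The placeholder can also carry formatting instructions. As an optional variation that the script in this article does not use, the sketch below zero-pads the number so that files sort in visiting order in a file browser, and shows the equivalent f-string form available in Python 3.6 and later.

```python
count = 1

# Plain placeholder, as used in the script.
plain_name = 'shots/screenshot{}.png'.format(count)
print(plain_name)

# Zero-padded to three digits (screenshot001.png, screenshot002.png, ...).
print('shots/screenshot{:03d}.png'.format(count))

# Equivalent f-string form, available in Python 3.6 and later.
padded_name = f'shots/screenshot{count:03d}.png'
print(padded_name)
```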

We also need to update count on each iteration of the for-loop. To do this, we can write

count = count + 1

which is a fairly conventional way, in Python and a lot of other languages, to increment a numeric value. This line takes the value of count, adds 1 to it, and sets the value of count to the newly incremented value.
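Python also offers the built-in enumerate() function, which hands out the counter alongside each item so no separate variable needs to be updated. This is an alternative to the manual counter, sketched here with made-up URLs and without a browser:

```python
urls = ["https://example.com\n", "https://example.org\n"]  # made-up entries

filenames = []
# enumerate() yields (count, item) pairs; start=1 makes numbering begin at 1.
for count, url in enumerate(urls, start=1):
    filenames.append('shots/screenshot{}.png'.format(count))

print(filenames)
```

Both approaches produce the same file names. The manual counter is kept in the rest of this article because it shows each step explicitly.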

We’ve now found a way to create separate screenshot files for each screenshot. Easy peasy, thanks to the WebDriver!

Closing The Browser

The last thing we need to do is close the browser session. Again, since this is a pretty standard operation the WebDriver provides a simple function for doing this, which looks like this:

driver.close()

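This closes the browser window and, since it is the only window open, the browser itself, freeing up the resources the browser used. (If a script opens several windows, driver.quit() closes them all and ends the session.) We now have our script.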

As Simple As That

And that’s it! Running this script should now generate screenshot files in the shots/ directory.

Here is a completed version of this script:

import os
from selenium import webdriver

if not os.path.exists('shots'):
    os.makedirs('shots')

with open('url_list.txt') as f:
    urls = f.readlines()

driver = webdriver.Chrome('/path/to/chromedriver/')

count = 1
for url in urls:
    url = url.strip()
    driver.get(url)
    driver.save_screenshot('shots/screenshot{}.png'.format(count))
    count = count + 1

driver.close()
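One possible refinement, offered as a sketch and not as the article's script, is to wrap the loop in a function and use try/finally so the browser is shut down even if a page fails to load. The names take_screenshots and RecordingDriver are hypothetical. RecordingDriver stands in for a real driver so the example can run without Chrome, and a temporary directory replaces shots/ for the same reason.

```python
import os
import tempfile


def take_screenshots(driver, urls, out_dir):
    """Visit each URL with the given driver and save a numbered screenshot."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    try:
        for count, url in enumerate(urls, start=1):
            driver.get(url.strip())
            path = os.path.join(out_dir, 'screenshot{}.png'.format(count))
            driver.save_screenshot(path)
            saved.append(path)
    finally:
        # Runs even if a page fails to load, so the browser is not left open.
        driver.quit()
    return saved


class RecordingDriver:
    """Hypothetical stand-in that records calls instead of driving a browser."""

    def __init__(self):
        self.calls = []

    def get(self, url):
        self.calls.append(('get', url))

    def save_screenshot(self, path):
        self.calls.append(('save', path))
        return True

    def quit(self):
        self.calls.append(('quit',))


demo_driver = RecordingDriver()
with tempfile.TemporaryDirectory() as tmp:
    demo_paths = take_screenshots(
        demo_driver, ["https://example.com\n"], os.path.join(tmp, "shots")
    )

print(demo_driver.calls[0], demo_driver.calls[-1])
```

With a real session, the same function would be called with the driver object created by webdriver.Chrome and the urls list read from url_list.txt.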

Running The Script

Now that we’ve written a draft of this script, let’s run it. This is pretty easy. If you open up a command prompt or terminal, you can execute the following commands:

cd /path/to/the/script/
python take_screenshots.py

That’s it! For a short list of URLs, the script should take only a few seconds to run. After the script runs, you should have a shots/ directory created in the same place as the script, containing numbered files with a screenshot of each URL.

Next Steps

We’ve now seen a bit of what Python can do for a tester. We’ve written a simple but effective script to help capture screenshots from a list of URLs. This script can be used whether there are ten URLs in the list, or ten thousand.

Remember, we did make some assumptions to make our script work. We can modify our script to relax these assumptions and make it more robust and reusable.

In the next part of this series, we’ll see how we can update this script to make it more reusable, such as allowing the script to set the location of the list of URLs. We’ll also see what kind of utilities Python provides for making scripts like this easier to use and more maintainable.


Josh Grant

Josh Grant is interested in automation, software development and big ideas. He’s currently working as a test developer doing cool automation work in Toronto, Canada. You can find him active on Twitter.
