Lessons learned in test-driven development: Software tester edition

Identify when TDD, traditional, or hybrid testing best fits your environment to deliver high-quality software


When I began my career as a test engineer about a decade ago, fresh out of school, I was not aware of formal approaches to testing. Then, as I worked with developers on teams of various sizes, I learned about several different approaches, including test-driven development (TDD).

I hope to share some insights into when I found TDD worked well. I'll also share my experience of when a traditional or hybrid testing approach may serve better than TDD on its own.

A great experience with test-driven development

First impressions

At first, TDD seemed counterintuitive to me—a reverse of the traditional approach of writing code first and testing later.

One of my first development teams was pretty small and flexible. So I suggested that we give TDD a try. Right off the bat, I could see that we could adopt TDD, thanks to the team's willingness to engage in supportive practices.

Advantages for test planning and strategy

The team engaged in test planning and test strategy early in the release cycle.

We discussed in detail potential positive and negative test cases that could come out of a feature. Each test case included expected behavior from the feature when exercised, and the potential value of the test. For us testers, this was a nice gateway to drive development design early by bringing team members to discussions upfront.

This sort of planning also facilitated the Red-Green-Refactor cycle at the heart of TDD:

  • Red: To write a failing test that defines a desired behavior
  • Green: To write just enough code to make the test pass
  • Refactor: To improve the code while keeping all tests passing
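To make the cycle concrete, here is a minimal sketch in Python with pytest. The `apply_discount` function and its test are hypothetical, not from any real codebase; the three stages are shown in one place, though in practice each is a separate step:

```python
# Red: write a failing test first. `apply_discount` does not exist yet,
# so the test fails and pins down the desired behavior.
def test_discount_is_applied():
    assert apply_discount(price=100.0, percent=20) == 80.0

# Green: write just enough code to make the test pass.
def apply_discount(price, percent):
    return price - price * percent / 100

# Refactor: improve the code while keeping the test green,
# for example by validating inputs and making the intent explicit.
def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)
```

The key discipline is running the test and watching it fail before writing any implementation, so you know the test can actually catch a regression.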

Time and clarity

We had the time and clarity to engage thoughtfully with the design process instead of rushing to implementation. Writing tests upfront helped surface design questions early, creating a natural pause for discussion before any major code was finalized.

This shifted the tone of the project from reactive to responsive. We were not simply reacting to last-minute feature changes; instead, we actively shaped the system with clear, testable outcomes in mind.

Solid documentation helps

TDD encourages documenting code through its expected behaviors. As a result, we had comprehensive internal and external user-level documentation, not just an API spec. Developers tied their code examples to these tests, and the internal feature documentation was detailed, explanatory, and regularly updated.
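One lightweight form this can take is executable documentation, where the documented examples are themselves tests. The Python doctest below is an illustrative sketch (the `normalize_score` helper is hypothetical):

```python
def normalize_score(raw: float, max_raw: float) -> float:
    """Scale a raw score into the range 0.0 to 1.0.

    The examples below are executable documentation: running
    `python -m doctest this_module.py` verifies that the documented
    behavior still matches the code.

    >>> normalize_score(50.0, 100.0)
    0.5
    >>> normalize_score(0.0, 100.0)
    0.0
    """
    if max_raw <= 0:
        raise ValueError("max_raw must be positive")
    return raw / max_raw
```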

Opportunities for healthy collaboration

TDD requires healthy collaboration, and our team enthusiastically interacted and discussed important issues, fostering a shared understanding of design and quality objectives. We were able to share the workload, especially since technical understanding was sound across the team. The developers did NOT have an attitude of "I type all the code and testers can take the time to test later." Quite the contrary.

Challenges of test-driven development in high-pressure environments

Fast forward to my experience at my current job at a FAANG company. Here, the focus is on responding to competition and delivering marketable products fast.

In this environment, I have observed that although TDD as a concept could have been incorporated, it did present several challenges:

Feature churn and speed hinder TDD adoption

The feature churn in our team is indeed very fast. People are pushed to get features moving. Developers openly resisted the adoption of TDD: working with testers on test-driven feature design was perceived as "slowing down" development. The effort-to-value ratio was questioned by the team. Instead, developers simply write a few unit tests to validate their changes before they merge them. This keeps the pipeline moving quickly.

As it turned out, about 80 percent of the product's features could in fact be tested after the feature was developed, and this was considered sufficient.

Features in flux and volatile requirements

One challenge we faced with TDD was when feature requirements changed mid-development.

Initially, one of the designs assumed that all of the team's clients would use a specific execution environment for machine learning models. But midway through development, stakeholders asked us to support a newer environment for certain clients while preserving the legacy behavior for others. Since we had written tests early based on the original assumption, many of them became outdated and had to be rewritten to handle both cases.

This made it clear that while TDD encourages clarity early on, it can also require substantial test refactoring when assumptions shift.
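The rework was roughly of this shape. In the sketch below (all names are hypothetical), pytest's `parametrize` lets one test body assert the preserved legacy behavior for some clients and the new behavior for others:

```python
import pytest

# Hypothetical stand-ins for the two execution environments.
LEGACY_RUNTIME = "legacy-runtime"
NEW_RUNTIME = "new-runtime"

def select_runtime(client_tier: str) -> str:
    """Pick an execution environment based on the client's tier."""
    return NEW_RUNTIME if client_tier == "premium" else LEGACY_RUNTIME

# The original test assumed a single runtime for everyone; after the
# requirement change it had to be rewritten to cover both behaviors.
@pytest.mark.parametrize(
    "client_tier, expected_runtime",
    [
        ("standard", LEGACY_RUNTIME),  # preserved legacy behavior
        ("premium", NEW_RUNTIME),      # newly requested behavior
    ],
)
def test_runtime_selection(client_tier, expected_runtime):
    assert select_runtime(client_tier) == expected_runtime
```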

Our models also relied on artifacts outside of the model, such as weights and other pre-processing data. (Weights are the core parameters that a machine learning model learns during its training process.) These details became clear only after the team strove for ever-higher efficiency over the course of the release. The resulting fluctuations made it difficult to go back and update behavioral tests.

While frequent requirement updates are not unique to TDD, their cost is amplified by it, because TDD depends on an iterative process to work. The development team was not in favor of creating behavioral tests for volatile projects only to have to go back and rework them later.

In general, TDD is better for stable code blocks. The early authoring of tests in the situation I've described did not appear to be as beneficial as authoring tests on the fly. Hence, the traditional code-then-test approach was chosen.

Frequent changes due to complex dependencies

With several dependencies spanning multiple layers of the software stack, it was difficult to pursue meaningful test design consistently. I noticed that not all teams whose work was cross-dependent communicated clearly and well. And so we caught defects mostly during full system tests.

Tests for machine learning features required mocking or simulating dependencies, such as devices, datasets, or external APIs. As those dependencies changed over the course of feature development, the tests became brittle: mocking code with underlying dependencies can lead to fragile tests. So, in our case, TDD appeared to work best for modular, isolated units of code.
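To illustrate the fragility, here is a sketch using Python's `unittest.mock` (the `ModelRunner` and device client are hypothetical). The test hard-codes the dependency's current interface, so any change to that interface breaks the test even when the feature still works:

```python
from unittest import mock

class ModelRunner:
    """Runs a model on an external device via an injected client."""

    def __init__(self, device_client):
        self.device_client = device_client

    def run(self, model_name: str):
        # Any change to the client's method signatures or return
        # shapes silently invalidates tests that mock this client.
        handle = self.device_client.allocate(model_name)
        return self.device_client.execute(handle)

def test_run_executes_on_allocated_device():
    client = mock.Mock()
    client.allocate.return_value = "handle-1"
    client.execute.return_value = {"status": "ok"}

    result = ModelRunner(client).run("resnet50")

    # These assertions encode the dependency's *current* interface;
    # if `allocate` gains a required argument, this test goes stale.
    client.allocate.assert_called_once_with("resnet50")
    assert result["status"] == "ok"
```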

Integration testing demands

TDD largely focuses on unit tests, which may not adequately cover integration and system-level behavior, leading to gaps in overall test coverage. It can get too tightly coupled with the implementation details rather than focusing on the broader behavior or business logic.

Many teams relied on us testers as the assessors of the overall state of product quality, since we were high up in the stack. The demand for regulated integration testing took up a big chunk of the team's energy and time.

We had to present results to our sponsors and stakeholders every few weeks, since the focus was on overall stack quality. Developers across teams also largely looked to the results of our integration test suite to catch bugs they might have introduced. It was mainly through our wide system coverage that multiple regressions were caught across the stack and across hardware, and action was taken.

Developers did not fully understand TDD processes

Though developers did author unit-level tests, they wrote their code first, the traditional way. The time to learn and use TDD effectively was seen as an obstacle, and developers were understandably reluctant to risk valuable time.

When developers are unfamiliar with TDD, they may misunderstand its core process of Red-Green-Refactor. Skipping or incorrectly implementing any stage can lead to ineffective tests. And this was the case with our team. Instead of creating tests that defined expected outcomes for certain edge cases, the attempts focused heavily on overly simplistic scenarios that did not cover real-world data issues.

Balancing TDD and traditional testing

In situations like my company's FAANG product, it does seem natural and obvious to fall back to the traditional testing approach of coding first and then testing. While this is a pragmatic approach, it has its challenges. For example, the testing schedule and matrix have to be closely aligned with the feature churn to ensure issues are caught right away, in development … not by the customer in production.

So, is it possible to achieve the best of both worlds?

The answer, as with any computer science-related question, is that it depends. But I say it is possible, depending on how closely you work with the engineers on your team and what the team culture is.

Though TDD might not give you a quality coverage sign-off, it does help to think from a user’s perspective and start from the ground up.

Start planning and talking early in the process

During the initial stages of feature planning, discussions around TDD principles can significantly influence design quality. This requires strong collaboration between developers and testers. There has to be a cooperative mindset and a willingness to explore the practice effectively.

Leverage hybrid approaches

TDD works well for unit tests, offering clarity and precision during the development phase. Writing tests before code forces developers to clarify edge cases and expected outcomes early.

TDD appears to be better suited to stable, modular components. It can also help in testing interactions between dependencies, rather than only components in isolation.

Meanwhile, traditional pipelines are better suited for comprehensive system-level testing. One could delay writing tests for volatile or experimental features until requirements stabilize.

Recognize the value of traditional integration testing pipelines

As release deadlines approach, traditional testing methods become critical. Establishing nightly, weekly, and monthly pipelines—spanning unit, system, and integration testing—provides a robust safety net. There is a lot of churn, which requires a close watch to catch regressions in the system and their impact. Especially during a code freeze, traditional integration testing is the final line of defense.
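One lightweight way to organize such tiers, sketched below with pytest markers (the marker names and tests are illustrative, not an actual pipeline configuration), is to tag tests by tier and let each scheduled job select its slice:

```python
import pytest

# Markers would be registered in pytest.ini / pyproject.toml, e.g.:
# [tool.pytest.ini_options]
# markers = ["unit", "system", "integration"]

@pytest.mark.unit
def test_config_defaults():      # fast: runs in every pipeline
    assert {"retries": 3}.get("retries") == 3

@pytest.mark.system
def test_end_to_end_smoke():     # nightly pipeline
    ...

@pytest.mark.integration
def test_cross_team_contract():  # weekly / pre-freeze pipeline
    ...

# Each scheduled job then selects its slice, for example:
#   nightly:  pytest -m "unit or system"
#   weekly:   pytest -m integration
```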

Automate testing as much as possible

I have found it indispensable to design and use automated system-level tools to make sign-off on projects easier. These tools can leverage artificial intelligence (AI) as well. Traditional testing usually becomes a bottleneck when tests are combinatorially explosive across models and hardware; with the advent of generative AI, test case generation can help here, taken with a pinch of salt.
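As a small illustration of the combinatorial problem (the model, hardware, and precision names below are made up), plain Python can at least enumerate the matrix and sample a reproducible nightly subset; an AI-assisted generator would aim to rank these combinations by risk instead of sampling blindly:

```python
import itertools
import random

MODELS = ["resnet50", "bert-base", "whisper-small"]
HARDWARE = ["gpu-a", "gpu-b", "cpu"]
PRECISIONS = ["fp32", "fp16", "int8"]

# The full matrix grows multiplicatively: 3 x 3 x 3 = 27 cases here,
# and far more once real model/hardware/precision lists are used.
full_matrix = list(itertools.product(MODELS, HARDWARE, PRECISIONS))

# Nightly runs can cover a reproducible subset, reserving the full
# matrix for the pre-release sign-off pipeline.
random.seed(42)  # fixed seed keeps the selection stable between runs
nightly_subset = random.sample(full_matrix, k=9)

for model, hw, precision in nightly_subset:
    print(f"schedule test: {model} on {hw} @ {precision}")
```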

A lot of my test tooling is based on code ideas obtained using AI.

AI-based TDD is picking up steam, but we are still not close to reliable, widespread use of artificial general intelligence (AGI) for testing.

To wrap up

For testers navigating the balance between TDD and traditional testing, the key takeaway is quite simple: Adapt the principles to fit your team’s workflow, do not hesitate to try out new things and understand the experience, and never underestimate the power of early behavioral testing (TDD) in delivering high-quality software.

About the author

I work as a Senior Software Test Engineer in machine learning, designing tools and infrastructure for testing and for product quality sign-off. I lead various quality testing projects on my team.