The problem: Debugging builds in a fast-moving world
Modern software changes quickly. Developers are constantly adding new features and fixing bugs, leading to frequent updates. With all these updates, sometimes things break, and finding out what caused the problem can be a time-consuming headache.
Let us imagine your team discovers a bug in a recent version of your software. Somewhere along the way, something went wrong. But how do you figure out when it started breaking?
You could test each build one by one, starting from the last known good version, until you find the first bad one. This is the straightforward approach, and it works. But it could take hours or even days, especially if builds happen frequently and each test run takes time.
This article introduces a faster way to pinpoint the exact version where an issue began: a more efficient search that tests several builds in parallel across multiple devices.
Why debugging at scale is difficult
Here is the challenge in simple terms:
- Too many versions: You may have 50 or 100 versions between the last good version and the one where the bug shows up.
- Full test runs are slow: Running the complete set of tests on even one version can take a long time.
- Limited resources: You might only have a few machines or devices on which to run these tests.
- Pressure to fix bugs fast: Engineers need quick answers to stay productive and keep release timelines on track.
Traditionally, testers would slowly go through each version to find out when the bug first appeared. Our team needed a faster approach.
The real-world need
When a bug shows up in testing, the first question developers ask is:
“Can you tell us the first version where this problem started?”
This is harder to determine than it sounds. With several test devices available, I set out to answer it more efficiently using a technique I call a 5-point parallel search.
The faster approach: 5-point parallel search
Imagine that you are flipping through a photo album to find the first picture where someone is missing. Instead of looking at one photo at a time, what if you picked five pictures spread out across the album and looked at them all at once?
If the person is missing in the last few photos but present in the earlier ones, you’ve already narrowed down when they disappeared. Each round, you only need to search a smaller range. That’s the idea behind the approach.
Step by step
- Find your starting points
  - Choose two builds:
    - C: The last good version where everything worked
    - Y: The version where the bug was first noticed
- Pick five builds between them:
  - The builds you choose should be evenly spaced in time between the last known good build and the latest one, like markers along a timeline.
  - Let’s call the builds X1, X2, X3, X4, and X5.
- Run tests on all five builds at the same time:
  - Use as many devices as you have available.
  - Cache or save results to avoid repeating any tests.
- Wait and watch:
  - As results come in, identify:
    - Earliest failing build (EFB): the first version where the bug shows up
    - Latest passing build (LPB): the last version where things still worked
- Narrow down the search:
  - If a version fails, look at earlier ones.
  - If a version passes, look at later ones.
  - Repeat the process in the new, smaller range.
  - Stop when you zoom in on the exact version: Eventually, you’ll find the first version where the bug appears, usually after just a few rounds.
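The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used in practice: the names `pick_probes`, `find_first_bad`, and `passes` are hypothetical, a thread pool stands in for a set of test devices, and the sketch assumes tests are deterministic and that every build after the first bad one also fails.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


def pick_probes(lo: int, hi: int, points: int = 5) -> List[int]:
    """Return up to `points` evenly spaced indices strictly between lo and hi."""
    span = hi - lo
    spaced = {lo + (i * span) // (points + 1) for i in range(1, points + 1)}
    return sorted(p for p in spaced if lo < p < hi)


def find_first_bad(
    builds: List[str],
    passes: Callable[[str], bool],
    points: int = 5,
    cache: Optional[Dict[str, bool]] = None,
) -> str:
    """Return the earliest failing build.

    Assumes builds[0] is the last known good build (C), builds[-1] is the
    known bad build (Y), and every build after the first bad one also fails.
    """
    cache = {} if cache is None else cache  # build id -> True if tests passed
    lo, hi = 0, len(builds) - 1  # lo tracks the LPB, hi tracks the EFB
    with ThreadPoolExecutor(max_workers=points) as pool:
        while hi - lo > 1:
            probes = pick_probes(lo, hi, points)
            # Only test builds that have no saved result yet.
            todo = [builds[p] for p in probes if builds[p] not in cache]
            # One worker per probe stands in for one test device per build.
            cache.update(zip(todo, pool.map(passes, todo)))
            for p in probes:  # walk from the oldest probe to the newest
                if cache[builds[p]]:
                    lo = p  # still passing: the bug starts later
                else:
                    hi = p  # first failure: the bug starts here or earlier
                    break
    return builds[hi]
```

Here `passes` is whatever function runs your test suite against one build and returns `True` on success. Passing the same `cache` dictionary into a later search lets it reuse results that were already collected.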
Why 5-point parallel search is faster
This method reduces both the number of test runs and the number of sequential rounds you have to wait for:
- Old way: Might take 50+ tests, one at a time.
- New way: Usually finds the problem in just three to five rounds, thanks to parallel testing and smarter searching.
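To see where the round counts come from, here is a rough worst-case estimate. It assumes each round splits the remaining range into `points + 1` nearly equal parts and that tests are consistent. The function name `rounds_needed` is made up for illustration, and `gap` means the number of builds from the last good version to the known bad one.

```python
def rounds_needed(gap: int, points: int = 5) -> int:
    """Worst-case number of rounds to shrink a range of `gap` builds to one."""
    rounds = 0
    while gap > 1:
        # Testing `points` builds splits the range into points + 1 parts;
        # in the worst case the bug hides in the largest part.
        gap = -(-gap // (points + 1))  # ceiling division
        rounds += 1
    return rounds
```

With a gap of 50 builds, `rounds_needed(50)` gives 3 rounds for five parallel tests, while `rounds_needed(50, points=1)` gives 6 rounds for a one-test-at-a-time binary search. The parallel search runs more tests in total (up to 15 instead of 6), but you wait for only half as many rounds.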
Real results
In practice, I saw:
- Debugging that was 70 to 80 percent faster.
- Developers and testers could share information sooner and take action quickly.
- Bug reports were clearer and based on specific code changes.
What to keep in mind
This method works best when:
- You know the last good version and a version where the bug is known to appear.
- Tests are consistent. If they randomly fail or pass, it is harder to narrow things down.
- You avoid repeating test runs by caching results.
- You have access to multiple test devices or you can simulate parallel test runs.
Limitations on effectiveness
- Setup time delays some tests, so a round can take longer than a single test run.
- If you have very few test devices, the benefit of parallel testing goes down.
- This method works like a magnifying glass; it is most useful when you have a known failure window.
Conclusion: Faster answers, happier teams
The 5-point parallel search method improves debugging efficiency by:
- Using parallel test execution across multiple builds
- Reducing the search space faster than searching through builds one by one
- Allowing you to proceed with your search as soon as the EFB or LPB is detected instead of having to wait for test runs on all builds to finish
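The last point can be sketched as a single round that stops as soon as the latest passing build and the earliest failing build are settled. This is only an illustration under stated assumptions: `narrow_one_round` and `passes` are hypothetical names, builds are referred to by index, tests are assumed to be consistent, and the `cancel_futures` argument needs Python 3.9 or later.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List, Tuple


def narrow_one_round(
    lo: int,
    hi: int,
    probes: List[int],
    passes: Callable[[int], bool],
) -> Tuple[int, int]:
    """Test probes in parallel; return (LPB, EFB) as soon as they are settled.

    Assumes lo is known good, hi is known bad, every probe lies strictly
    between them, and results are consistent (no flaky tests).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(probes)))
    try:
        futures = {pool.submit(passes, p): p for p in probes}
        for done in as_completed(futures):
            p = futures[done]
            if done.result():
                lo = max(lo, p)  # latest passing build seen so far
            else:
                hi = min(hi, p)  # earliest failing build seen so far
            # Once no probe sits strictly between the LPB and the EFB,
            # the remaining results cannot change the answer.
            if not any(lo < q < hi for q in probes):
                break
    finally:
        # Do not wait for runs that no longer matter (Python 3.9+).
        pool.shutdown(wait=False, cancel_futures=True)
    return lo, hi
```

In this sketch, runs that have already started keep going in the background; on real devices you would probably also want to signal them to stop so the hardware is free for the next round.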
By using a smarter search strategy and running tests in parallel, you can save time, reduce frustration, and help your team move faster. This approach doesn’t require advanced math or complex tools, just a clear plan, some test devices, and the willingness to try a better way of doing things.
The next time someone asks “When did this bug show up?” you could be ready with an answer in hours instead of days.