Building better tests with AI

MoT Berlin

Ruslan Strazhnyk

Founder/CEO

15 Jun 2026

Every QA tool vendor will tell you AI generates perfect tests. After building an AI test generation platform - and dogfooding it on our own codebase - I can tell you what actually happens.

This talk is a practitioner’s honest debrief. I’ll walk through two years of running multi-model AI against real web apps: what produces usable tests, what produces confident-looking garbage, and where the failure modes hide.

Specifically, I’ll cover:

Why reading code isn’t enough - AI generates plausible tests from source, but they fail on real UIs. Crawling the live app changes everything.
The selector problem - LLMs reach for brittle CSS selectors by default. How to force better strategies without prompt-engineering every call.
Assertions that rot - AI loves asserting exact text and prices. Why your generated suite breaks on the first content change, and how to catch it before CI does.
Multi-model routing - no single model wins at everything. What we learned running GPT-4o, Claude, and Gemini on the same flows.
Self-healing in practice - the gap between “it healed” and “it healed correctly.”
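
To make the selector and assertion points above concrete, here is a minimal sketch of a pre-CI check that scans generated test source for two of the risks listed: brittle CSS selectors and hard-coded prices. It is an illustration only, not the method used by the speaker's platform. The function name `find_rot_risks`, the `Finding` type, the regex heuristics, and the Playwright-style sample test are all assumptions made for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical heuristics (assumptions for this sketch, not from the talk):
# positional pseudo-classes, child combinators, and generated class names
# such as ".css-1x2y3z" tend to break when markup or styling changes.
BRITTLE_SELECTOR = re.compile(r":nth-child\(|\s>\s|\.css-[0-9a-z]+")
# A currency symbol followed by an amount, e.g. "$19.99".
PRICE_LITERAL = re.compile(r"[$€£]\s?\d+(?:[.,]\d{2})?")
# Any single- or double-quoted string literal on one line.
STRING_LITERAL = re.compile(r"""(["'])(.*?)\1""")


@dataclass(frozen=True)
class Finding:
    line: int   # 1-based line number in the test source
    kind: str   # "brittle-selector" or "hard-coded-price"
    text: str   # the offending string literal


def find_rot_risks(test_source: str) -> list[Finding]:
    """Flag string literals in generated tests that are likely to rot."""
    findings = []
    for number, line in enumerate(test_source.splitlines(), start=1):
        for match in STRING_LITERAL.finditer(line):
            literal = match.group(2)
            if BRITTLE_SELECTOR.search(literal):
                findings.append(Finding(number, "brittle-selector", literal))
            if PRICE_LITERAL.search(literal):
                findings.append(Finding(number, "hard-coded-price", literal))
    return findings


# Illustrative generated test (Playwright-style calls, shown only as text).
SAMPLE = '''
page.locator("div.css-1x2y3z > ul > li:nth-child(3)").click()
expect(page.get_by_role("heading", name="Checkout")).to_be_visible()
expect(page.locator("#total")).to_have_text("$19.99")
'''

if __name__ == "__main__":
    for finding in find_rot_risks(SAMPLE):
        print(f"line {finding.line}: {finding.kind}: {finding.text}")
```

On the sample, the first and third statements are flagged while the role-based locator passes. A text-level check like this is deliberately crude: it cannot tell whether an exact-text assertion is intentional, so its findings are better treated as review prompts than as hard failures.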