A tester’s role in evaluating and observing AI systems

17 Oct 2025

Carlos Kidman

Senior Quality Architect

Talk Description

As more teams build products powered by AI models, testers have a growing opportunity to shape how these systems are evaluated and understood. In this talk, Carlos Kidman shows how testers can apply familiar testing skills to the world of AI, using LangSmith to create manual and automated evaluations, define quality attributes, and observe how AI behaves in development and production.

Through live examples, Carlos demonstrates how to design meaningful tests for non-deterministic systems, measure performance and accuracy, and add value to AI projects from design to deployment. You’ll see that many of the skills testers already have, such as analysis, evaluation, experimentation, and observation, translate directly into testing AI.

By the end of this session, you'll be able to:

Describe a tester’s role in evaluating and observing AI systems.
Explain how to design and run manual and automated tests for AI models using LangSmith.
Identify useful metrics and evaluation techniques for assessing AI quality.
Apply observability tools to monitor AI performance and behaviour in production.
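As a rough illustration of what testing a non-deterministic system can look like, the sketch below runs the same prompt many times and asserts on a pass rate rather than on a single exact answer. It is plain Python and does not use LangSmith; `fake_model`, `contains_expected`, `pass_rate`, and the 0.8 threshold are hypothetical names and values chosen for this example, not material from the talk.

```python
import random
from statistics import mean


def fake_model(question: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an AI model: right about 90% of the time."""
    return "The capital is Paris." if rng.random() < 0.9 else "The capital is Lyon."


def contains_expected(output: str, expected: str) -> float:
    """Simple evaluator: 1.0 if the expected answer appears in the output, else 0.0."""
    return 1.0 if expected.lower() in output.lower() else 0.0


def pass_rate(question: str, expected: str, runs: int, rng: random.Random) -> float:
    """Ask the same question `runs` times and return the fraction of passing outputs."""
    scores = [contains_expected(fake_model(question, rng), expected) for _ in range(runs)]
    return mean(scores)


# Seeded so the example is repeatable; a real model would not offer this guarantee.
rng = random.Random(42)
rate = pass_rate("What is the capital of France?", "Paris", runs=200, rng=rng)

# Assert on an aggregate threshold (an assumed quality bar), not on one output.
assert rate >= 0.8, f"pass rate too low: {rate:.2f}"
```

The same pattern extends to other quality attributes: swap the evaluator for one that scores tone, latency, or format, and keep the threshold as an explicit, reviewable decision.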

Carlos Kidman

Senior Quality Architect

He/Him

Carlos is a Senior Quality Architect and AI Engineer. He is the founder of QA at the Point, but is best known for his hands-on courses and presentations on using AI and testing AI systems.
