Quality Statements for LLMs: The Good, The Bad and The Ugly
Bastian Knerr
Teamlead Testing
Talk Description
AI as a buzzword is everywhere. It will steal our jobs, make us all obsolete and, in the end, it will rule the world. We have been experiencing a paradigm shift for two years, and, most prominently, Large Language Models like LLaMA, ChatGPT or Bard are reshaping industries and our everyday lives.
Using a Copilot for coding or testing is seen as boosting productivity and lowering barriers to entry.
But now that the use of these LLMs is increasing rapidly:
- Who is testing them?
- And what does quality actually mean in the age of AI?
In this talk, I will share results from my experience testing Large Language Models and regression-based AI in real projects. I will explain, at a high level, how a Large Language Model works.
I will map the components of a Copilot onto a rethought testing pyramid, from the component level up to the system level. With this framework for testing LLMs in place, I will outline the metrics we use and why testers will still be needed in the age of AI, maybe even more than ever.
By the end of this session, you'll be able to:
- Explain, at a high level, how a Large Language Model works and where the pitfalls for testing lie
- Apply a standardized, high-level approach to testing Large Language Models
- Navigate a new testing pyramid and identify the component level in LLM systems
- Evaluate what quality means in the age of AI, which metrics we can use and how context-dependent they are
- Articulate the importance of a tester's perspective and why testers will still be important going forward