Data Bias

A systematic skew in a dataset that causes a model trained on it to produce outputs that are consistently inaccurate, unfair, or unrepresentative for certain inputs, groups, or contexts. Data bias can originate from how data was collected, labelled, filtered, or weighted, and it is often invisible until the model is tested across a broad range of conditions.

So what? Data bias is one of the most consequential quality risks in AI systems because it is baked in before a line of application code is written. Testing for it requires deliberate coverage of underrepresented groups, edge cases, and real-world distributions, not just happy-path inputs.

Examples: A hiring tool trained predominantly on CVs from male candidates learns to downrank applications from women, not because of an explicit rule but because of patterns in the training data. An image recognition model trained on photographs taken in high-income countries performs poorly on images from lower-income contexts where lighting conditions, camera quality, and subject framing differ.
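
One way to make this kind of skew visible is to break an evaluation metric down by group rather than reporting a single overall number. The following is a minimal sketch, not a complete fairness audit: the group names, the records, and the helper functions `accuracy_by_group` and `accuracy_gap` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, expected, predicted) records."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, expected, predicted in records:
        totals[group] += 1
        if expected == predicted:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


def accuracy_gap(per_group):
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())


# Hypothetical evaluation results: (group, expected label, predicted label).
records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 0, 0),
    ("group_b", 1, 0), ("group_b", 0, 0), ("group_b", 1, 0), ("group_b", 1, 1),
]

per_group = accuracy_by_group(records)
overall = sum(expected == predicted for _, expected, predicted in records) / len(records)

print("overall accuracy:", overall)
print("accuracy by group:", per_group)
print("accuracy gap:", accuracy_gap(per_group))
```

In this made-up data the overall accuracy looks acceptable, while the per-group breakdown shows that one group is served far worse than the other. A check like this only helps if the evaluation set contains enough examples from each group, which is why deliberate coverage matters.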