Baking Quality into Your Data Pipeline - Ali Khalid
19 Nov 2020
Ensuring data quality is widely identified as one of the most challenging problems in Big Data. It starts with identifying the scope of the data pipeline at each junction. The next step is to pick the data quality dimensions most relevant to business criticality and to build automated checks that provide insight into the quality of the data.
The session is designed to give participants an introduction to how a big data project is structured, how data flows, what quality checks are generally used and how to automate them. The main sections in the talk are:
- Differences between big data and conventional data usage
- Sample technology stack for a big data project
- Introduction to a data pipeline
- The kinds of tests and automation needed
- Data quality dimensions (why data quality is important, an explanation of the six dimensions, and a demo of how to test them in our sample pipeline)
- Automating data quality checks
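To make the last two items concrete, here is a minimal sketch of automated data quality checks in plain Python. The abstract does not name the six dimensions, so this assumes the commonly cited set (completeness, uniqueness, validity, timeliness, consistency, accuracy). The sample records, field names, reference data and the 0.9 threshold are all hypothetical and are not taken from the talk's demo pipeline.

```python
import re
from datetime import datetime, timedelta

# Hypothetical sample batch; field names are illustrative only.
RECORDS = [
    {"id": 1, "email": "a@example.com", "country": "NZ", "currency": "NZD",
     "updated": datetime(2020, 11, 18)},
    {"id": 2, "email": None, "country": "NZ", "currency": "NZD",
     "updated": datetime(2020, 11, 18)},
    {"id": 2, "email": "bad-email", "country": "AU", "currency": "NZD",
     "updated": datetime(2020, 10, 1)},
]

# Assumed rules and reference data for this sketch.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
COUNTRY_CURRENCY = {"NZ": "NZD", "AU": "AUD"}
TRUSTED_COUNTRY_BY_ID = {1: "NZ", 2: "NZ"}


def completeness(records, field):
    """Share of records where the field is present and not None."""
    return sum(r.get(field) is not None for r in records) / len(records)


def uniqueness(records, field):
    """Share of distinct values among all values of the field."""
    values = [r[field] for r in records]
    return len(set(values)) / len(values)


def validity(records, field, pattern):
    """Share of non-null values that match the expected format."""
    values = [r[field] for r in records if r.get(field) is not None]
    return sum(bool(pattern.match(v)) for v in values) / len(values)


def timeliness(records, field, now, max_age):
    """Share of records refreshed within max_age of now."""
    return sum(now - r[field] <= max_age for r in records) / len(records)


def consistency(records):
    """Share of records whose currency agrees with their country."""
    ok = sum(COUNTRY_CURRENCY.get(r["country"]) == r["currency"] for r in records)
    return ok / len(records)


def accuracy(records, reference):
    """Share of records whose country matches a trusted reference source."""
    return sum(reference.get(r["id"]) == r["country"] for r in records) / len(records)


def run_checks(records, now, threshold=0.9):
    """Score every dimension and list the ones that fall below the threshold."""
    scores = {
        "completeness": completeness(records, "email"),
        "uniqueness": uniqueness(records, "id"),
        "validity": validity(records, "email", EMAIL_RE),
        "timeliness": timeliness(records, "updated", now, timedelta(days=7)),
        "consistency": consistency(records),
        "accuracy": accuracy(records, TRUSTED_COUNTRY_BY_ID),
    }
    failed = sorted(name for name, score in scores.items() if score < threshold)
    return scores, failed


if __name__ == "__main__":
    scores, failed = run_checks(RECORDS, now=datetime(2020, 11, 19))
    for name, score in scores.items():
        print(f"{name}: {score:.2f}")
    print("below threshold:", failed)
```

In a real pipeline, checks like these would typically run at each junction (after ingestion, after each transformation, before publishing), with the failing dimensions reported or used to stop the run. Scoring each dimension as a ratio rather than a pass/fail flag is one possible design choice; it lets the threshold be tuned per dataset according to business criticality.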