Bob Salmon
Tech Lead
I am open to mentoring, speaking, writing, and podcasting
I'm a programmer who likes people, code and data. That means I think that things like quality and user experience are important too.
Achievements
Certificates
Level up your software testing and quality engineering skills with the credibility of a Ministry of Testing certification.
Activity
Earned: Spoke at MoT Cambridge
Earned: Schema (Database Schema)
Achieved: This badge is awarded to members who join a Ministry of Testing Chapter.
is Open to Podcasting
is Open to Write
Contributions
Learn how focusing on user value and trust gives you a clearer, more effective way to test data quality
A data contract is a document that defines the ownership, structure, semantics, quality, and terms of use for exchanging data between a data producer and their consumers. It is human- and machine-readable, and so can be used as both a communication tool between teams and a way to automatically detect when expectations about data are broken.
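As a rough illustration of the machine-readable side of a data contract, the sketch below checks one record against a small contract. The contract contents, the field names, and the `FieldRule` and `violations` names are all hypothetical, and a real contract would normally live in a shared document rather than in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRule:
    """One field the producer promises to supply (hypothetical shape)."""
    name: str
    type_: type
    required: bool = True


# Hypothetical contract: the owner and fields are invented for illustration.
CONTRACT = {
    "owner": "payments-team",
    "fields": [
        FieldRule("payment_id", int),
        FieldRule("amount", float),
        FieldRule("note", str, required=False),
    ],
}


def violations(record: dict, contract: dict) -> list[str]:
    """Return a description of each way the record breaks the contract."""
    problems = []
    for rule in contract["fields"]:
        value = record.get(rule.name)
        if value is None:
            # Absent optional fields are allowed; absent required ones are not.
            if rule.required:
                problems.append(f"missing required field: {rule.name}")
            continue
        if not isinstance(value, rule.type_):
            problems.append(
                f"{rule.name}: expected {rule.type_.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems
```

An empty list from `violations` means the record meets the expectations written down in the contract; anything else is a broken expectation that can be reported automatically.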
Write-Audit-Publish (WAP) is a pattern for designing data pipelines where a pipeline is built up of several sections. Each section produces a result data set that is used by one or more downstream sections and conforms to the same three-stage process:
Write: The main work of the section is done and data is written to a staging area that is inaccessible to other sections
Audit: The staging data is checked using automated checks
Publish: Only data that passes the audit is published to downstream sections of the pipeline
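The three stages above can be sketched for a single pipeline section as follows. This is a minimal sketch under stated assumptions: the staging area is just a local variable, the published area is a plain dictionary, the audit is all-or-nothing for the whole data set, and `run_section` and the two audit functions are hypothetical names.

```python
def not_empty(rows: list[dict]) -> bool:
    """Audit check: the section produced at least one row."""
    return len(rows) > 0


def no_negative_amounts(rows: list[dict]) -> bool:
    """Audit check: every row has a non-negative amount."""
    return all(row["amount"] >= 0 for row in rows)


def run_section(name, write, audits, published: dict) -> list[str]:
    """Run one section and return the names of any failed audit checks."""
    # Write: the result stays in staging, invisible to other sections.
    staging = write()
    # Audit: run every automated check against the staged data.
    failed = [audit.__name__ for audit in audits if not audit(staging)]
    # Publish: downstream sections only ever see data that passed.
    if not failed:
        published[name] = staging
    return failed
```

Downstream sections read only from `published`, so a failed audit leaves them with the last good data (or none) rather than with bad data.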
Medallion data architecture is a way of splitting the ingestion and processing of data into three stages: bronze, silver, and gold.
In the bronze stage, ingested data is stored in its unaltered form.
The silver stage attempts to fix problems in the bronze data and augment it by linking it with other data, producing a more usable version.
The gold stage takes the silver data and summarises it, along with any other processing needed to make the data ready for consumption by downstream processes.
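A toy version of the three stages might look like the sketch below. The input format (lines of "customer id, amount"), the customer lookup, and the function names are assumptions made for illustration, and real implementations would usually store each stage in its own tables or files.

```python
from collections import defaultdict


def to_bronze(raw_lines):
    """Bronze: keep the ingested lines exactly as they arrived."""
    return list(raw_lines)


def to_silver(bronze, customers):
    """Silver: drop unparseable lines and link each row to a customer name."""
    silver = []
    for line in bronze:
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 2:
            continue  # wrong number of columns
        customer_id, amount_text = parts
        try:
            amount = float(amount_text)
        except ValueError:
            continue  # amount is not a number
        silver.append({
            "customer_id": customer_id,
            "customer_name": customers.get(customer_id, "unknown"),
            "amount": amount,
        })
    return silver


def to_gold(silver):
    """Gold: summarise the silver rows as a total amount per customer name."""
    totals = defaultdict(float)
    for row in silver:
        totals[row["customer_name"]] += row["amount"]
    return dict(totals)
```

Because bronze is never altered, the silver and gold stages can be re-run from it if the cleaning or summarising rules change.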
YAML is a way of expressing structured data that is both machine-readable and human-readable, similar to JSON. It is often used for configuration and contracts that need to be understood and maintained by people as well as systems.
Data quality is weird - Bob Salmon
The definition of schema given by Emily O'Connor is the most common meaning of the word schema in the context of databases, including relational databases.
There is another meaning for schema, also in the context of relational databases, which can be confusing. In this other meaning, a schema is a container of database tables, views etc., and so acts as a way of sub-dividing the things in a database instance. A database instance is something with a connection string, that you can log into. Within that will be one or more schemas, each of which will contain zero or more tables etc. On Microsoft SQL Server databases the default schema is dbo, which is short for database owner.
There are a few reasons why you might choose to have more than one schema in a database:
Schemas can divide tables up into groups of related tables, e.g. relating to different features of the associated software. This is particularly useful when the database grows to have many tables. The advantages of using schemas rather than separate databases include the fact that transactions and foreign key relationships can reach from one schema to another, but can't (easily) span from one database to another.
Permissions can be granted at the schema level, so that e.g. a user can be given read access to all tables in one schema but not to a different schema in the same database instance.
Database table names must be unique within a schema, but you can reuse a table name if each table with that name is in a different schema.
In SQL Server, the full version of a database table's name (or other database object such as a view) has four parts: Server.Database.DatabaseSchema.DatabaseTable. Other databases, such as Postgres, can have different full name formats but will usually include the schema name. The four parts of the SQL Server name are:
Server is the database server, i.e. one running instance of the database software, and not the physical or virtual machine it runs on.
Database is the database instance - one database server can host one or more database instances. Each database instance has a separate connection string.
Database schema e.g. Finance
Database table e.g. MonthlyPaymentSummary
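To make the four parts concrete, the sketch below splits a full name into its components. It assumes none of the parts contains a dot or uses bracket quoting, which real SQL Server names can; `ObjectName`, `parse_full_name`, and the server and database names in the usage note are hypothetical.

```python
from typing import NamedTuple


class ObjectName(NamedTuple):
    """The four parts of a full table name, in order."""
    server: str
    database: str
    schema: str
    table: str


def parse_full_name(full_name: str) -> ObjectName:
    """Split 'Server.Database.Schema.Table' into its four parts.

    Assumption: no part contains a dot or is wrapped in [brackets].
    """
    parts = full_name.split(".")
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected four non-empty parts, got: {full_name!r}")
    return ObjectName(*parts)
```

For example, `parse_full_name("ReportServer.Accounts.Finance.MonthlyPaymentSummary")` gives a schema of `Finance` and a table of `MonthlyPaymentSummary`.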
Bridge the gap between developers, testers, and data teams to create stronger, people-centred quality practices.
Tasked with reviewing a large requirements document? Bob Salmon has you covered with his handy tips for reviewing requirements documents
Bob Salmon solves the Data Management Challenge from Test.bash(); 2022 using SQLServer Studio and Visual Studio.