| title | date | topics | related | abstract |
| --- | --- | --- | --- | --- |
| Testing Probabilistic Systems | 2026-04-20 | | | The testing pyramid was built for deterministic, functional code, and it breaks on probabilistic systems where "correctness" is a statistical property rather than a per-invocation one. ML components and signal-producing pipelines demand a different shape of test, and the usual telos-shaped diagrams do not quite accommodate them. |
LiverMultiScan has ML components, and cardiac T1 mapping produces distributions rather than binary outcomes. The testing pyramid was built for deterministic, functional code; it breaks on probabilistic systems, where "correctness" is a statistical property, not a per-invocation one. This is a natural sequel to Testing Telos: none of your four shapes quite fits ML. Google's "ML Test Score" paper [1] and Christian Kästner's "Machine Learning in Production" [2] are good starting points. This is also where your concern about LLMs and your day job most obviously meet.
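To make "statistical rather than per-invocation" concrete, here is a minimal sketch of a test that asserts on aggregate behaviour over many seeded runs instead of on a single output. Everything in it is an assumption for illustration: `noisy_t1_estimate` is a hypothetical stand-in for a probabilistic component, and the 1000 ms truth, 25 ms noise, and all thresholds are made-up numbers, not values from LiverMultiScan or any real T1 mapping pipeline.

```python
import random
import statistics


def noisy_t1_estimate(true_t1_ms: float, rng: random.Random) -> float:
    """Hypothetical probabilistic component: the truth plus Gaussian noise."""
    return rng.gauss(true_t1_ms, 25.0)  # 25 ms noise is an illustrative assumption


def evaluate(n_samples: int = 2000, seed: int = 0) -> dict:
    """Run the component many times and summarise the error distribution."""
    rng = random.Random(seed)  # fixed seed keeps the test reproducible
    true_t1_ms = 1000.0        # illustrative ground truth, not a clinical value
    errors = [noisy_t1_estimate(true_t1_ms, rng) - true_t1_ms
              for _ in range(n_samples)]
    return {
        "bias_ms": statistics.fmean(errors),
        "spread_ms": statistics.stdev(errors),
        "fraction_within_50ms": sum(abs(e) <= 50.0 for e in errors) / n_samples,
    }


def test_statistical_contract() -> None:
    """Assert on properties of the distribution, not on any single call."""
    report = evaluate()
    assert abs(report["bias_ms"]) < 5.0             # no systematic offset
    assert 15.0 < report["spread_ms"] < 35.0        # noise is in the expected band
    assert report["fraction_within_50ms"] >= 0.90   # most estimates are close


if __name__ == "__main__":
    test_statistical_contract()
    print(evaluate())
```

A classic per-invocation assertion such as `abs(estimate - truth) <= 50` would fail on roughly one call in twenty under these assumed noise levels, even though the component is behaving exactly as specified. The thresholds here are deliberately loose relative to the sampling error, so the test fails on real drift rather than on an unlucky seed.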