work-blog/testing-probabilistic-systems.md at 544b773e8f9b5227a0190b0fe4cd85218c0495c1

Gregory Gauthier 544b773e8f feat(drafts): add initial drafts for philosophy-inspired testing articles

Introduces nine new draft articles exploring intersections of software testing with philosophy, epistemology, and related concepts:
- On Flakiness (Heraclitus and non-deterministic tests)
- Popper and the Risky Test (demarcation criterion)
- Regression as Institutional Memory (Wittgenstein's On Certainty)
- Tacit Knowledge and the Testing Checklist (Polanyi's tacit dimension)
- Test Environments as Platonic Shadows (Plato's cave allegory)
- The Tester as Witness (legal metaphor and testimony)
- Testing Probabilistic Systems (ML and statistical testing)
- The Oracle Problem (oracles in testing frameworks)
- When Quality Becomes Quantity (Goodhart's Law and metrics)

2026-04-20 09:28:28 +01:00

739 B


Testing Probabilistic Systems. LiverMultiScan has ML components; cardiac T1 mapping produces distributions, not binary outcomes. The testing pyramid was built for deterministic, functional code, and it breaks down on probabilistic systems, where "correctness" is a statistical property of many invocations rather than a pass/fail verdict on any single one. This is a natural sequel to Testing Telos: none of your four shapes quite fits ML. Google's "ML Test Score" paper[1] and Christian Kästner's "Machine Learning in Production"[2] are good starting points. This is also where your concern about LLMs and your day job most obviously meet.

[1] https://research.google/pubs/the-ml-test-score-a-rubric-for-ml-production-readiness-and-technical-debt-reduction/

[2] https://ckaestne.github.io/seai/
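To make "correctness as a statistical property" concrete, here is a minimal sketch of an acceptance test that judges a pass rate over many invocations instead of asserting on each call. Everything in it is an assumption for illustration: `noisy_classifier` is a hypothetical stand-in for an ML component (it is not LiverMultiScan or any real model), and the 0.90 threshold, the trial count, and the choice of a Wilson score lower bound are placeholders, not recommendations from either reference.

```python
import math
import random


def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to a two-sided 95% interval.
    """
    if trials == 0:
        return 0.0
    p_hat = successes / trials
    denom = 1 + z ** 2 / trials
    centre = p_hat + z ** 2 / (2 * trials)
    margin = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z ** 2 / (4 * trials ** 2)
    )
    return (centre - margin) / denom


def noisy_classifier(x: float, rng: random.Random) -> bool:
    """Hypothetical ML component: returns the right label about 95% of the time."""
    truth = x > 0.5
    return truth if rng.random() < 0.95 else not truth


def statistical_acceptance(
    trials: int = 2000, required: float = 0.90, seed: int = 0
) -> bool:
    """Pass only if we are reasonably confident the true accuracy is >= required."""
    rng = random.Random(seed)  # fixed seed so the test itself is reproducible
    successes = 0
    for _ in range(trials):
        x = rng.random()
        if noisy_classifier(x, rng) == (x > 0.5):
            successes += 1
    # Compare the interval's lower bound, not the raw rate, to the requirement.
    return wilson_lower_bound(successes, trials) >= required


if __name__ == "__main__":
    print("accepted:", statistical_acceptance())
```

The design choice worth noting is that the assertion is made on a confidence bound, so a single wrong prediction never fails the test, while a real drop in accuracy still does once enough trials accumulate.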
