

I've already discussed in a previous post, albeit to an incomplete degree, what a tester is. In this post, I'll be addressing perhaps an even more important question. Namely: what is a test? The answer is hinted at in yet another previous post, where I asserted that a tester is fundamentally a “fact finder”. From that label, one could infer a very low-resolution definition: if a tester is a fact finder, then a test is the process by which the finder goes about finding his facts.

This answer is not wrong, but it's also not very informative. There are lots of facts to find in the world. My precise height and weight. The date of the Whiskey Rebellion. Where the data for my project is stored. The property of transparency in glass. What time you arrived at work this morning. And so on. Every one of these facts entails a different process by which it is discovered. Does that make every fact-finding effort a test? Does it make every fact an artefact of a testing activity?

In a word, no. There are a wide variety of ways in which we come to discover facts, and not all of those activities are “tests”. Even within the academic disciplines, the methods vary. A historian, a mathematician, a scientist, and a theologian all employ different techniques for the discovery of facts (and, indeed, use the word “fact” in very different ways, but that's a story for another time). So, what are we talking about, then?

The Role Of The Tester

A tester is a fact-finder in a specific context, and of a specific kind. Namely, a tester is only interested in facts about the project on which he works and his discovery processes will be those best suited to provide him with the facts relevant to the aspect of the project under scrutiny.

Thus, the tester is a finder of facts about the application he is testing, facts about the people who interact with the application, and facts about the relationship between these two poles. He uses testing practices appropriate to each of these three questions, to discover the relevant facts.

But, just as there are many different kinds of facts about the world and many different kinds of disciplines that discover those facts, there are also many different kinds of facts about your software project, and many different kinds of testing needed to discover those facts.

Testing Categories

Any cursory internet search will yield dozens of pages that will give you a variety of lists of “kinds of tests”. But most of these lists don't do a very good job of defining the kinds, choosing instead to describe the practices and outcomes in lieu of a definition. Others will offer two very broad (and again, not very informative) categories of “functional” and “non-functional”, where everything from units to user experiences counts as “functional”, and everything from configuration to compliance counts as “non-functional”.

Instead, as I've already telegraphed above, I think it makes more sense to group kinds of tests by the kinds of facts we're interested in, not by the kinds of activities we're engaged in to discover those facts. To put it another way: what matters is not so much how we gather the information, but what the information is for. When viewed from this vantage point, there are essentially three kinds of facts we're interested in:

  • Facts about the behaviour of the application under test

  • Facts about the behaviour of the human beings interacting with the application under test

  • Facts about the relationship between the two

The facts gathered in each of these three categories enable us to make decisions about the way the application is constructed, the possibilities it will offer to users, and how those possibilities are presented. They will also enable us to make decisions about who will be using our application, how we expect them to use it, the kinds of experiences we want them to have, and the value we expect them to derive from those experiences.

Application Behaviour

This is what most people think of when they think of “functionality” - what does your application do? Under this category, many familiar headings will appear: API testing, unit testing, integration testing, and so forth. But, again, we should think in terms of the kinds of facts, and what they are for.

Here, context is the key concept. Different contexts have different scopes, and as such different facts we care about. As the context changes, the scope changes, and with it, so do the human beings involved.

Units

While it's true that a developer is not necessarily a user of the application he is building, he is nonetheless a human being who interacts with the application. His interactions are those of a builder. So, he is intimately familiar with its structure, its component parts, and the relationships between them.

As he makes changes to the application, he will want to know many things: is the unit of change producing the results I expect? Is the unit affecting the units around it? Are the changes affecting the structural integrity of the whole? Are there any unexpected outputs? And so forth. A developer will want to answer these questions quickly and precisely, as a part of his workflow. To be productive and efficient, he cannot wait weeks for each change to land in the hands of a user before discovering these answers.

This is the purpose of unit tests. They are a programmatic tool enabling the application to provide the developer with immediate feedback about his changes. They provide facts that are essential to the decisions the developer makes during the development process.
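As a minimal sketch of what this looks like in practice (the `increment` function and its `maximum` parameter are invented here for illustration, not taken from any real project), a unit test is just a small, fast check the developer can run on every change:

```python
# A hypothetical unit: advances a counter value, saturating at a maximum.
def increment(value, maximum=10):
    """Return value + 1, but never exceed maximum."""
    return min(value + 1, maximum)


# Unit tests: immediate, precise answers to "is the unit of change
# producing the results I expect?", runnable in the developer's workflow.
def test_increment_advances_by_one():
    assert increment(3) == 4


def test_increment_saturates_at_maximum():
    assert increment(10) == 10


if __name__ == "__main__":
    test_increment_advances_by_one()
    test_increment_saturates_at_maximum()
    print("all unit tests passed")
```

In a real project these functions would typically be collected by a test runner such as pytest rather than called by hand, but the shape of the feedback loop is the same.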

Integrations

Integrations are a level above units, and involve the orchestration of two or more units. This is sometimes called “functional testing”, in the sense that multiple units combine to produce whole conceptual “functions”. Think, for example, of a unit that generates a numeral every time it executes, and another that adds numerals it receives to an accumulator to produce a new number. Independently, each is useful in its own right. Together, they produce something neither could do alone: an infinite accumulator. This is a “function” because it performs some relatively simple transformation, but requires the integration of multiple units to accomplish the goal.

The purpose of an integration test, then, would be to prove that what I described to you just now is actually what happens when you put these two units together. This would be the first question you might ask: “does it do what you say it does?” You might also want to answer some (but not all) of the questions from the section on units. Typically, the developer is the person interested in the answers to these questions. In many development organisations, the developers will therefore also write their own integration tests. However, in some cases, integration tests can be written by test engineers, or by other specialised roles tasked with responsibility for certain kinds of integrations.
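The numeral-generator-plus-accumulator example above can be sketched directly (the names and shapes of these two units are my own invention; the point is the test at the bottom, which exercises the pair rather than either unit alone):

```python
import itertools


# Unit 1 (hypothetical): generates a numeral every time it is asked.
def numeral_generator():
    for n in itertools.count(1):
        yield n


# Unit 2 (hypothetical): adds numerals it receives to an accumulator.
class Accumulator:
    def __init__(self):
        self.total = 0

    def add(self, numeral):
        self.total += numeral
        return self.total


# Integration test: "does it do what you say it does?" - i.e. does
# wiring the two units together actually produce a running accumulation?
def test_generator_feeds_accumulator():
    gen = numeral_generator()
    acc = Accumulator()
    results = [acc.add(next(gen)) for _ in range(4)]
    assert results == [1, 3, 6, 10]  # 1, 1+2, 1+2+3, 1+2+3+4
```

Note that the test asserts nothing about how either unit works internally; it only proves the behaviour of the orchestrated pair.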

Systems

Systems are orchestrated collections of integrations that deliver a coherent value to a user. Think of a computer system. It is the assembled result of hundreds of integrated functions, and thousands (or tens of thousands) of individual units. From component parts like capacitors, to integrated circuit chips, to the motherboard and its daughter boards, to the chassis and all of the peripherals, the final result is a product that you and I can use to accomplish an infinite number of different goals.

As we move into wider and wider contexts, the questions we want to answer will grow in scope along with the move. System testing is essential for discovering the answers to questions that cannot be asked at smaller contexts like the unit and integration level. It is one thing to know how much stress a wheel can undergo at speed. It is another thing to know how a steering assembly will behave when wheels of a certain weight are attached to the axle. But it is yet another question to know how an entire vehicle will perform with that steering assembly and wheels, on a test track. This, then, is system testing.

At this level of testing, the number of questions we will want to answer grows dramatically, and the ways of deriving answers to those questions are myriad and diverse. There are whole categories of questions one can ask about a complete system; internet lists will often present these as separate kinds of testing: performance, security, compatibility, reliability, user journeys, and so forth. As such, tests may be written and performed by any number of different kinds of people, from developers and testers, to application specialists, industry reviewers, or specialised auditors (think Underwriters Laboratories).
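In software, the distinguishing mark of a system test is that it exercises the assembled whole through its external interface, not its internal parts. Purely as an illustration, here is a self-contained sketch in which the “system” is a tiny stand-in HTTP service (invented for this example) and the test talks to it the way a client would:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


# A stand-in "system": a minimal HTTP service. In reality this would be
# the fully assembled, deployed application.
class EchoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # keep test output quiet


def test_system_responds_over_http():
    # System test: start the whole thing, then probe it from the outside,
    # with no knowledge of its internal units or integrations.
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        with urlopen(f"http://127.0.0.1:{port}/health") as resp:
            assert resp.status == 200
            assert resp.read() == b"ok"
    finally:
        server.shutdown()
```

The same outside-in posture underlies performance, security, and reliability testing; what changes is the kind of question being put to the running system.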

Human Behaviour

So far, we have only been dealing with questions we might have about the behaviour of the application under test. But there is another aspect of testing that is often neglected when considering the testing of applications: namely, the behaviour of the human beings who use them.

The facts we're most interested in getting out of this kind of testing are different from the kinds of facts we get out of application behaviour testing. In this case, we want to know whether a user is able to accomplish a goal, what kinds of processes and functions he prefers in accomplishing his goal, and even his feelings and opinions about various features of the application.

User Experience

In the same way we often make assumptions about the way an application will behave when we subject it to certain conditions, we also often make assumptions about the way a human will behave when we present him with an application user interface and give him a goal. This is where user experience testing comes in. Its purpose is to compare our assumptions about our users to the real behaviours exhibited in real-world conditions.

Today, this is sometimes called “usability” testing (because of the modern trend of attaching an “-ility” suffix to every aspect of an application). This might involve walking a user through a particular task and making a note of where he struggles to complete it, or asking a user to point out where certain things are found in the application UI, or it could be as complicated as giving a user a complete goal to accomplish and recording his journey from start to completion.

Quality vs Quantity

The two main varieties of “usability” testing are “qualitative” and “quantitative”. The former is traditional “UX” testing, in which user feedback is collected and used to improve the design of an application with a focus on the qualities of the individual user experience. The latter is designed to identify problems with process efficiency or performance bottlenecks, and involves collecting benchmark indicator data that can be tracked over the span of a large cohort of users, or over a large number of user journey iterations.

UX Testing Techniques

Here are some common techniques for collecting either qualitative or quantitative facts about user experience:

  • Concept Testing: Select users are presented with wireframes, paper sketches, or physical mock-ups, and testers engage in conversations with users about the proposed concepts.

  • A/B Testing: Two different versions of an application UI are deployed to a UAT environment or production environment, and select sets of users will be presented with either the “A” version or the “B” version. Evidence of the users' journeys are collected and evaluated to determine which version is desirable.

  • “Heat Map” Analysis: This is one of the “quantitative” techniques. This essentially aggregates a record of the places on the application where clusters of user interactions have taken place, across a cohort of users. It provides information about common user behaviours, and places where the application may be causing behavioural bottlenecks.

  • Card Sorting: This technique is designed to improve the information architecture of a product, by evaluating the way your users understand the concepts, topics, and categories of information employed in the application.

  • Moderated Testing: This testing is an observational technique meant to provide direct feedback about the mechanics of a user journey. Users are given specific tasks to accomplish, and testers record the relative ease or difficulty of accomplishing those tasks. Often additional subjective feedback is also collected as part of this process.
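To make the quantitative side of this list concrete, here is a small sketch of the aggregation behind a “heat map”. The click coordinates and the grid cell size below are invented for illustration; real tooling would collect these events from instrumented sessions across a cohort of users:

```python
from collections import Counter

# Hypothetical recorded click coordinates (x, y in pixels) gathered
# across a cohort of user sessions.
clicks = [(12, 40), (14, 42), (200, 310), (13, 41), (205, 305), (12, 44)]


def heat_map(clicks, cell=50):
    """Aggregate raw clicks into a grid of cell x cell pixel buckets."""
    return Counter((x // cell, y // cell) for x, y in clicks)


hot_spots = heat_map(clicks)
# The busiest bucket reveals where user interactions cluster - and,
# by contrast, which parts of the UI users are ignoring.
busiest, count = hot_spots.most_common(1)[0]
```

Even this toy aggregation surfaces the kind of fact the technique is after: interactions cluster in one region of the screen, which is a prompt for design questions rather than an answer in itself.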

Application-User Relationship

The third and final question any good development team will be interested in asking is what sort of relationship exists between the user and your product. This is a bit of a meta-question, in that it arises out of a synthesis of the first two questions and a third input: market demand and business needs. The goal is to discover how the first two questions connect in the middle to deliver value, and the facts needed to answer that question are derived from the outcomes of the first two, combined with what can be discovered about the broader needs of customers.

This is a question asked more often by sales and service teams, or by product owners, than by development teams and testers. The kinds of facts that would be interesting, and the sorts of questions that would surface those facts, are going to be business-delivery oriented, but focused on the product lines provided by the business. This is sometimes known as “market testing” or “business development”, and is beyond the scope of this article.

Conclusion

The perceptive reader will note the gradually expanding scope of each of these kinds of “tests”. The further up you go from the unit, the greater your view of the landscape. Simultaneously, the further up the “stack” you go, the less specific and granular your understanding of that landscape. From a plane you can differentiate farmland from urban areas, but you wouldn't be able to tell me where the occupant of flat 10 in building 4 of the urban area has parked his car, or how much grain farmer Smith has left this quarter for his cows. Likewise, farmer Smith could tell me about his grain, but could not tell me what the population of the urban area is, or whether there is enough roadway to handle the new development in region 3.

Each context offers different kinds of facts, and as such, different kinds of questions can be asked within it. This requires different kinds of tests, and different kinds of testers trained to ask the right questions and to interpret the answers, looking for the facts that help to make decisions about the product being tested.