I was appalled – the QA engineer was playing a video game instead of doing real testing! But it turned out I had jumped to conclusions.
The video game was actually part of the test plan. The QA engineer was testing the compatibility of our PC software with the new video drivers, and the game he was playing was known to stress video cards.
I had just joined that company and this was new territory for me. My previous experience was in the aerospace industry, where we had very formal and strict quality assurance practices. So this turned out to be a learning opportunity for me. What was different between the two companies?
The level of risk.
In aerospace, the level of risk if you have faulty software is really high. Planes could crash and people get hurt. We absolutely had to get it right and prove to our stakeholders that we took all precautions possible to test all the systems.
For that PC software company, the risk lay in a different area – time to market. We had the ability to push changes out, but if our competitors released before we did, we might be put out of business. So the emphasis was on speed and efficiency of testing.
I’ve learned that it’s important to design a test strategy to match the relevant risks. This practice is called risk-based testing. I use risk-based approaches to optimize testing – focusing more on the higher-risk items and, if possible, reducing the investment in the areas that have less risk.
This method involves rating the quality risk of features, components, or test plans, then using that risk information to prioritize testing. I generally rate risk by evaluating two parameters, probability and severity:
- Probability shows how likely a problem is to happen, and can be estimated based on factors like history and complexity.
- Severity is based on the customer impact and can be estimated by how important the feature is to the customers, how many customers would be impacted, or the gravity of a problem (security, privacy, data integrity, safety, etc.).
We can then assign a risk rating to each item using a chart like the one shown below.
For example, if a feature has a high probability of failing, and there is a high severity if it does fail, then it has pretty much the highest level of risk that you can encounter. On the flip side, if you have a feature that is rarely used, and you haven’t done much development in that area, you might assess this as low risk.
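That combination of probability and severity can be sketched in code. This is a minimal illustration, not a prescribed formula – the feature names, ratings, and the simple multiply-the-scores scheme are all hypothetical examples.

```python
# A minimal sketch of a probability x severity risk rating.
# The features and their ratings below are hypothetical examples.

PROBABILITY = {"low": 1, "medium": 2, "high": 3}
SEVERITY = {"low": 1, "medium": 2, "high": 3}

def risk_rating(probability: str, severity: str) -> int:
    """Combine the two ratings into a single 1-9 risk score."""
    return PROBABILITY[probability] * SEVERITY[severity]

features = {
    "payments": ("high", "high"),     # new code, and customers' money at stake
    "sign_in": ("medium", "high"),    # stable code, but a failure blocks everyone
    "legacy_report": ("low", "low"),  # rarely used, rarely changed
}

# Sort features so the riskiest items get tested first.
by_risk = sorted(features, key=lambda f: risk_rating(*features[f]), reverse=True)
print(by_risk)  # payments first, legacy_report last
```

The payoff is the sorted list at the end: it turns the chart into a test-priority order.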
Rating the severity
The severity ratings will vary greatly depending on the type of software that you are developing. Airplanes, autonomous automobiles, weapon systems, and medical devices generally have much higher severity levels than a “free-to-play” mobile game.
Assuming that I’m working on software that doesn’t have these types of safety risks involved, I’ll generally look at several factors for the high-severity items:
- Sign up and sign in – If your customers can’t access your app, that’s pretty severe.
- Payments – We should protect our customers’ money as if it were our own.
- Security – Data breaches only benefit the bad guys.
- Usage analytics – I usually look at the top 10 features used by customers.
As a side note on usage analytics, I worked on a product that had about 150 features. I gathered the usage logs from all of the servers and found something surprising. Out of all the features, 93% of the product usage came from just 3 of them. So we made sure to test the heck out of those 3 features and de-emphasized the others.
Of course, you need to use the analytics as just one input to your risk factor. Signing up for an account, or entering a credit card, might only be done once by a new customer, but those interactions are vital to get right.
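This kind of usage analysis can be done with a very short script. The snippet below is a toy sketch – the log format (one feature name per recorded event) and the feature names are hypothetical, and real logs would need parsing first.

```python
from collections import Counter

# A sketch of summarizing feature usage from server logs.
# The event list (one feature name per event) is a hypothetical example.
events = [
    "export", "search", "search", "open_file", "search",
    "open_file", "search", "open_file", "print", "search",
]

usage = Counter(events)
total = sum(usage.values())

# Report each feature's share of total usage, most-used first.
for feature, count in usage.most_common():
    print(f"{feature}: {count / total:.0%}")

# Cumulative share of the top 3 features -- if this is very high,
# those features deserve the bulk of the testing effort.
top3 = sum(count for _, count in usage.most_common(3))
print(f"top 3 cover {top3 / total:.0%} of usage")
```

If the top few features dominate the total, as they did on that 150-feature product, the cumulative number makes the case for concentrating test effort there.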
Rating the probability
How likely a problem is to occur, or how likely a feature is to break, is always something of a guessing game. But you can start with a few of these ideas:
- History – You might be able to query your source-control system for how often your modules have bug fixes checked in. That is usually an indication that the code is susceptible to further bugs.
- New code – Generally, new code is going to be buggier than existing code, just because it hasn’t been put to the test by your customers.
- Ask the developers – I generally ask the developers where they think we could use a better design or what might be prone to bugs.
- Ask the support team – If you have a tech support team, they can usually tell you what areas give them the most pain.
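The history idea above can be roughed out as a script. In practice you would pull the (path, message) pairs from your source-control history (for example, `git log`); the commit records, module layout, and fix-detection markers below are all hypothetical examples.

```python
from collections import Counter

# A sketch of estimating defect-proneness from commit history.
# These (path, message) records are hypothetical examples; real ones
# would come from your source-control system.
commits = [
    ("billing/invoice.py", "fix: rounding error in totals"),
    ("billing/invoice.py", "add annual plans"),
    ("billing/tax.py", "fix: wrong VAT rate for EU"),
    ("search/index.py", "refactor index writer"),
    ("search/query.py", "fix: crash on empty query"),
    ("ui/theme.py", "update colors"),
]

# Crude heuristic: treat a commit as a bug fix if its message
# contains one of these markers.
FIX_MARKERS = ("fix:", "bug", "hotfix")

def is_bug_fix(message: str) -> bool:
    return any(marker in message.lower() for marker in FIX_MARKERS)

# Count bug-fix commits per top-level module.
fixes = Counter(path.split("/")[0] for path, msg in commits if is_bug_fix(msg))
print(fixes.most_common())  # the module with the most fixes gets a higher probability rating
```

The module that accumulates the most bug-fix commits is a reasonable candidate for a higher probability rating.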
Once you have a feeling for the risk, you can adjust your testing approach. Obviously, you’ll want to invest more energy in the higher-risk items. I usually start with a white-box approach, partnering with Development to make sure we take the time to do thorough code reviews and unit testing.
Then, I’ll work with the test team to make sure they are 100% confident in the testing for these high-risk areas. Sometimes, there is a temptation to give equal coverage for all areas, but the risk-based approach calls for more concentrated testing on the high-risk areas.
Once the team is confident in their coverage, I usually try to get a second (or third) set of eyes on that testing as well.
Adding testers to a project can be tough. Risk-based testing usually increases the testing demands for the higher-risk items, and there are generally no extra testers lying around. In that case, I’ll start by pulling testers off of the lower-risk areas, but this strategy can only go so far – if something is low-risk and you stop testing it entirely, it could move higher up on your risk list. Using a testing services provider can give you that second pair of eyes on your high-risk items, and also cover testing for your lower-risk features – which will free up your testers and development team to tackle the high-risk features.