Integrating Defect Management into SCRUM

SimonMorris · ‎10-18-2011

In a change of subject from ITSM... now for something completely different.

I've been doing some research into integrating Defect Management (as in Software bugs) into SCRUM. I thought I'd offer up what I've found so far in the hope that some commentary and improvements would follow.

Introduction to the subject

SCRUM is an Agile development process that aims to produce software that is rich in functionality and low in defects. The methodology is based around prioritising features, and favouring the development of those features over long term building of specifications and requirements gathering.

Importantly, it shortens the software development lifecycle by testing earlier.

It does this by removing the long testing phases associated with traditional software development methodologies and instead creating teams of cross-functional engineers who can iteratively design, code and test.

Defect Management is the process of detecting, logging, categorizing, prioritizing and resolving defects in software.

SCRUM terminology

User Stories: Small, discrete descriptions of user functionality. Stories will be scored using a points system depending on the complexity of the implementation.
Sprint: The smallest unit of time in SCRUM planning. A timeboxed period in which to complete a pre-determined number of Stories.
Sprint Backlog: The list of outstanding work to be completed during the current sprint
Product Backlog: The overall list of outstanding work relating to the project/product. The Product Backlog is maintained by the Product Owner

The Wikipedia page on SCRUM has a nice visualization of the Product Backlog, Sprint Backlog and Sprints

Roles in SCRUM

SCRUM defines the following roles:

Product Owner: Represents the voice of the business and the users of the system. Is able to prioritise features (represented as Stories).
Team: Responsible for the implementation of the project/product and the creation of the code to support User Stories. The team's "success" is measured by its velocity: how many story points it is able to implement during a given sprint.
ScrumMaster: Responsible for leading the team and removing any impediments to progress.

When is a Defect not a Defect

Because of the tight integration between development and testing, a number of defects should be identified and resolved during the sprint.

Tests should be written before development starts. The developer implementing the story will test it during coding, and Quality Assurance is carried out either by a peer developer or by a dedicated tester working as part of the same SCRUM team.

If a test fails at this point, it's important to classify this defect differently to one found after the code is released (for example by a user).

Strictly speaking the code hasn't been declared as "done" at this stage: the team hasn't said it is ready for use. However, if metrics need to be taken (for example the Defect Detection Percentage) you may need to record these test failures as a type of defect.

Although there is a defect in the code, it's important to differentiate this from a defect that "escapes" the sprint.

Ultimately, in an environment where (a) tests are written before the code and (b) tests are good enough to find defects, by the time the code is accepted by the Product Owner we will have already been through some aggressive testing.

It is the responsibility of the Product Owner not to accept code if tests are outstanding and it isn't "done".

Anticipating Defects in Sprint planning

SCRUM is oriented towards writing the functionality and features that matter, and towards delivering a usable product. Teams tend to assume that the majority of their time will be spent on developing new features rather than fixing defects introduced at an earlier stage.

For new products it's certainly true that nearly 100% of the time will be spent on developing new features. However, over time, as escaped defects arise and come back to the team, sprint planning must account for the amount of time spent on resolving defects.

This image from agileevidence.com shows the introduction of escaped defects into the sprint planning, and that amount of effort increasing over time as more escaped defects are found

Types of SCRUM bugs

When deciding on how to manage defects as part of the SCRUM process it is important to identify the type of Defect, and to follow the appropriate action.

Type	Description	Action
Simple	The implementation is wrong, but the story was complete and correct	Doesn't count towards team velocity
Incomplete Story	The implementation is right, but the story was missing a critical detail.	The team velocity shouldn't be adversely impacted
Wrong Story	The implementation was right, but the story was a bad idea.	The team velocity shouldn't be adversely impacted
Changed Story	The implementation was right, and the story was right, however the requirements changed over time	The team velocity shouldn't be adversely impacted
Legacy Bug	No story exists for this feature as it was coded before SCRUM was adopted	Should count towards velocity and be treated differently to "In SCRUM" defects

Prioritizing Defects

As escaped defects are found and logged, it's important to prioritize them correctly in order to gauge when and how they should be resolved.

Firstly, a priority scheme should be used to identify Critical defects that must be addressed and resolved immediately. This may involve using resources already scheduled on a sprint, reducing the number of features we aim to deliver.

For that reason this should only be done for the highest priority of Defect, and having a repeatable way of assigning priority will help here.

In the same way that User Stories should be rated (to determine priority and importance of new functionality) a similar scheme should be used for defects.

This is especially important as bugs (with any other severity than Critical) will be considered alongside User Stories for inclusion into a particular sprint.

Engineering resource is finite. For example, a team of 4 engineers sprinting for a month has 640 hours available (40 hours * 4 weeks * 4 engineers), so it is important to assign that resource to the highest priority feature or defect.

Ultimately this is the decision of the Product Owner who will choose User Stories (both functional and defect related) to put into the Sprint Backlog.

Using Michael Lant's example

To prioritize a Defect, the Scope and Severity should be considered.

Scope - How many users, customers or how much of the system is affected

Value	Guideline
5	Affects most or all users and/or a very large range of system functionality
4	Affects a large set of users and/or large range of system functionality
3	Affects a moderate set of users and/or moderate range of system functionality
2	Affects a small set of users and/or a small range of system functionality
1	Affects a minimal set of users and/or a very small range of system functionality

Severity - How critical is the issue

Value	Guideline
5	Data loss, data corruption or system unavailable
4	Important functionality is unavailable with no workaround
3	Important functionality is unavailable but has a reasonable workaround
2	Secondary functionality is unavailable but has a reasonable workaround
1	Cosmetic issues or some functionality unavailable but has a simple workaround

With this information we can evaluate the priority of the defect.

and determine the correct action...
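As a rough illustration of how the two ratings could be combined, here is a minimal Python sketch. The text above doesn't state the combining rule, so the sketch assumes the priority is simply Scope multiplied by Severity (giving a score from 1 to 25). The function name `defect_priority` and the sample defects are made up for the example.

```python
def defect_priority(scope: int, severity: int) -> int:
    """Combine the 1-5 Scope and Severity ratings into a single score.

    Assumption: priority = scope * severity. The combining rule is not
    given in the text, so treat this as one possible scheme.
    """
    for name, value in (("scope", scope), ("severity", severity)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return scope * severity


# Hypothetical defects as (title, scope, severity).
defects = [
    ("Typo on login page", 5, 1),
    ("Report export loses rows", 2, 5),
    ("Search slow for one customer", 1, 3),
]

# Highest score first, so the Product Owner sees the most pressing defect at the top.
ranked = sorted(defects, key=lambda d: defect_priority(d[1], d[2]), reverse=True)
for title, scope, severity in ranked:
    print(defect_priority(scope, severity), title)
```

A product is only one choice. Its appeal is that a defect needs to rate highly on both Scope and Severity to reach the top of the list.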

Although the priority of a User Story is calculated from different inputs (Urgency and Business Impact), it is important to rank stories and defects on the same scale so that the Product Owner is able to compare them when deciding what should be added to a particular Sprint.

Handling the Defect

Handling Critical Defects	If a critical defect is found, by definition a very serious condition exists that affects multiple people. Every effort must be made to fix the issue straight away, at the expense of progress on the current sprint.
Handling Simple defects	Simple Defects are characterized as having been introduced to the system during a sprint. The defect remained undetected during developer, peer and QA testing and is said to have "escaped" the sprint. In this case the effort involved in resolving the defect is not recognized as team velocity, as the team is repaying a debt incurred in an earlier sprint. The defect should be associated with a User Story and placed into the Product Backlog. The User Story should be assigned zero points. During sprint planning the team should pick the Story relating to the defect according to the priority set by the Product Owner. Naturally, with the inclusion of zero-point stories the team's velocity for that sprint will be reduced. As part of resolving the defect the team should also improve the unit tests for that code so that future test runs will pick up these problems.
Handling Incomplete or Wrong Story bugs	Some defects will be introduced to the system due to an incomplete, wrong or changed user story. An example of an "Incomplete Story bug" would be a story that is missing key technical details, so that the implementation follows the intent of the story but the outcome is wrong. An example of a "Wrong Story bug" is a concept that is described well in the story and implemented correctly, but the entire concept was flawed and should be removed. Lastly, an example of a "Changed Story bug" would be a correct implementation of a story that then generates new, changed or unforeseen use cases which don't work correctly. In these cases the team velocity for future sprints shouldn't be impacted negatively. A User Story should be associated with the Defect and placed into the Product Backlog. The Story should be scored accordingly, and the Product Owner can prioritize it and assign it to a sprint alongside stories that describe functionality enhancements.
Handling Legacy Defects	Legacy Defects are characterized as having been introduced to the system during a non-SCRUM development cycle. Because of this there won't be a supporting User Story that describes the functionality the code was supposed to provide, although it may be possible to use Version Control Systems to identify where the bug was introduced. To balance clearing legacy bugs against providing new functionality, the team shouldn't be "penalized" by discounting the work required to resolve the bug from the velocity of the sprint. A new User Story should be created for the Defect and placed into the Product Backlog. The Story should be assigned points according to the estimated effort involved in resolving the defect. The Product Owner can then prioritize and assign the Story to a Sprint. The work involved in fixing the defect counts towards the team's velocity.
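The velocity rules above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class and function names (`DefectType`, `DefectStory`, `sprint_velocity`) and the sample numbers are hypothetical, and Critical defects are left out because they are fixed immediately rather than planned into a sprint.

```python
from dataclasses import dataclass
from enum import Enum


class DefectType(Enum):
    SIMPLE = "simple"
    INCOMPLETE_STORY = "incomplete story"
    WRONG_STORY = "wrong story"
    CHANGED_STORY = "changed story"
    LEGACY = "legacy"


@dataclass
class DefectStory:
    title: str
    defect_type: DefectType
    estimated_points: int  # the team's estimate of the effort to fix

    @property
    def velocity_points(self) -> int:
        # Simple (escaped) defects become zero-point stories: the fix repays
        # a debt from an earlier sprint, so it earns no velocity.
        if self.defect_type is DefectType.SIMPLE:
            return 0
        # Incomplete, wrong, changed and legacy defects are scored normally.
        return self.estimated_points


def sprint_velocity(feature_points, defect_stories):
    """Total points credited to the team for one sprint."""
    return sum(feature_points) + sum(d.velocity_points for d in defect_stories)


# Hypothetical sprint: three feature stories plus two defect stories.
defects = [
    DefectStory("Escaped rounding bug", DefectType.SIMPLE, 3),
    DefectStory("Crash in pre-SCRUM import code", DefectType.LEGACY, 5),
]
print(sprint_velocity([5, 3, 8], defects))  # prints 21
```

The simple defect still consumes roughly three points of effort, but only the legacy fix is credited, which is why the sprint's velocity dips when zero-point stories are included.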

Measuring the effectiveness of Defect Management

This subject is probably expansive enough for another blog post, but let's go into detail on one KPI and summarize the others.

Defect Detection Percentage (DDP)

DDP is the proportion of all known defects that were discovered prior to release (by the SCRUM team) rather than after release (by customers).

To be able to calculate DDP the following metrics must be taken:

Affected version of software in which the defect was found
The release date for each version
Number of defects detected at the point of release
Number of escaped defects

The DDP is calculated using the following formula


DDP = Number of defects at date of release / (Number of defects at date of release + Number of escaped defects)

So, if testing prior to release detected 100 defects, and a further 20 were detected once the code was released as "finished" the DDP would be calculated as


DDP = 100 / (100 + 20) = 0.83 (83%)
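For anyone who wants to compute this from their own defect counts, here is a small Python sketch of the same calculation. The function name `defect_detection_percentage` is my own, and the loop at the end simply varies the escaped-defect count to show the effect on the figure.

```python
def defect_detection_percentage(found_before_release: int, escaped: int) -> float:
    """Return DDP as a fraction between 0 and 1."""
    total = found_before_release + escaped
    if total == 0:
        raise ValueError("DDP is undefined when no defects have been recorded")
    return found_before_release / total


# Worked example from the text: 100 defects found before release, 20 escaped.
ddp = defect_detection_percentage(100, 20)
print(f"DDP = {ddp:.2f} ({ddp:.0%})")  # prints "DDP = 0.83 (83%)"

# DDP falls as more escaped defects are reported against the same release.
for escaped in (0, 5, 10, 20):
    print(escaped, f"{defect_detection_percentage(100, escaped):.0%}")
```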

As the DDP will change over time (as more escaped defects are found) the metric is best displayed in the following method

See the Google Spreadsheet

Other Defect Management KPIs include

Defect Removal Efficiency
Defect Find Rate
Escaped Defects Found
Mean age of unresolved defects
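Of these, the mean age of unresolved defects is the easiest to compute. A minimal Python sketch follows, assuming age is measured in whole days from the date a defect was logged to a chosen reporting date. The function name and the sample dates are hypothetical.

```python
from datetime import date


def mean_age_of_unresolved_defects(opened_dates, as_of):
    """Average age in days of defects that are still open on `as_of`."""
    if not opened_dates:
        return 0.0
    ages = [(as_of - opened).days for opened in opened_dates]
    return sum(ages) / len(ages)


# Hypothetical open defects, identified only by the date they were logged.
open_defects = [date(2011, 10, 1), date(2011, 10, 10), date(2011, 10, 16)]
mean_age = mean_age_of_unresolved_defects(open_defects, as_of=date(2011, 10, 18))
print(mean_age)  # prints 9.0
```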

Additional Reading

Thanks to the authors of the following articles:

As I dive deeper into Scrum and Defect Management I'll post again. Thanks for reading this far!
