The Change Was Low Risk. It Took Down Three Services. Here's Why That Keeps Happening.
Change management exists to prevent service disruptions. Yet change-induced incidents account for a significant share of all production outages. That's not a process failure; it's an information failure. Change managers can only govern what they can see, and CSDM is what makes the full picture visible.
Standard Change. Normal Approval. Four Hours of Incident Response.
The request came through on a Tuesday afternoon: a routine configuration update to the shared authentication database cluster. The change manager reviewed it, checked the infrastructure risk score — medium — and approved it for the Thursday evening maintenance window. The submitting team had done this type of change a dozen times. It went through the standard approval path without escalation.
Thursday evening, 9:47 p.m.: the update is applied. By 10:02 p.m., the service desk queue is filling. The benefits enrollment portal is returning errors. The case worker dashboard has gone read-only. The document upload service is timing out. Three separate incident tickets are opened by three separate teams before anyone connects the dots.
At 10:34 p.m., someone on the infrastructure team realizes the change went to the authentication cluster — the shared technical service that all three affected applications depend on. The revert takes eighteen minutes. The incident bridge call takes another hour. The post-incident review finds the root cause immediately: the change was correctly scoped at the infrastructure level, and completely blind to its service-level blast radius.
The change manager didn't make a mistake. They made a decision with incomplete information. The question isn't whether they should have known — it's whether the system should have told them.
✦ ✦ ✦
The Problem
Change Management That Evaluates Components, Not Consequences
Traditional change risk assessment works at the wrong level of abstraction. A change request says: "modify this configuration item." The risk assessment answers: "how risky is it to modify this configuration item?" That's a useful question. It's also the wrong question for preventing service disruptions.
The right question is: "what services depend on this configuration item, how critical are those services to the business, and what happens across the entire dependency chain if this change introduces a problem?" That question requires service context — and service context requires CSDM.
The Two Ways Infrastructure-Centric Risk Assessment Fails
Over-approval: Changes that look routine at the infrastructure level get approved without scrutiny, when the affected component is actually a shared dependency for multiple critical services. The result is exactly what happened Thursday evening — a change approved as medium risk that caused a high-impact multi-service incident.
Under-approval: Changes that affect large but low-criticality infrastructure stacks get over-escalated, because the change manager can see the scale but not the criticality. This creates bottlenecks for genuinely routine work and trains teams to view the change process as a bureaucratic obstacle rather than a useful governance checkpoint.
Both failure modes stem from the same gap: the change management system can see what's being changed, but not what depends on it or what that dependency means for the business. Closing that gap is what CSDM-enabled change management is built to do.
The Explanation
What Changes When Change Management Has Service Context
When CSDM relationships are current and accurate, every configuration item in the CMDB has a traceable path upward: from infrastructure CI to technical service, to application service, to business application, to business capability. A change request that references a CI now carries implicit information about everything above it in that chain — and change management can surface that information automatically.
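To make that chain concrete, here is a minimal sketch of the upward traversal, using a toy in-memory map in place of actual CMDB relationships. The service and capability names are invented to mirror the scenario above, not drawn from a real model; in practice these relationships live in the CMDB and follow the CSDM reference model.

```python
# Each entry: a CI or service -> the things one level above it that depend on it.
# Hypothetical data, invented to mirror the Thursday evening scenario.
DEPENDS_ON_ME = {
    "auth-db-cluster-01":         ["authentication-platform"],        # infrastructure CI -> technical service
    "authentication-platform":    ["benefits-enrollment-portal",
                                   "case-worker-dashboard",
                                   "document-upload-service"],        # technical service -> application services
    "benefits-enrollment-portal": ["Benefits Administration"],        # application service -> business capability
    "case-worker-dashboard":      ["Case Management"],
    "document-upload-service":    ["Case Management"],
}

def upward_blast_radius(ci: str) -> set[str]:
    """Collect everything above a CI in the dependency chain."""
    affected, frontier = set(), [ci]
    while frontier:
        current = frontier.pop()
        for parent in DEPENDS_ON_ME.get(current, []):
            if parent not in affected:
                affected.add(parent)
                frontier.append(parent)
    return affected

# Returns the technical service, the three application services,
# and the two business capabilities sitting above the cluster.
print(sorted(upward_blast_radius("auth-db-cluster-01")))
```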
The Thursday evening incident becomes preventable at the point of change submission, not at the point of post-incident review. The change manager sees up front the three application services and two business capabilities that were pulled into four hours of incident response. They see that a similar change caused an incident eleven months ago. The automatic risk score escalates to high. The CAB reviews it. Additional testing is required. The change happens in a properly staged, properly approved window, or it doesn't happen until it can.
Three Specific Capabilities CSDM Unlocks in Change Management
Service criticality-weighted risk scoring. Not all services are equal, and change risk shouldn't be calculated as if they are. CSDM allows services to be classified by criticality — driven by the business capability they support and the operational importance of that capability. A change affecting a configuration item that supports a Tier 1 citizen-facing service is scored differently than an identical change affecting a Tier 3 internal reporting tool. The change is the same. The risk is not.
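A minimal sketch of what criticality-weighted scoring looks like in practice, assuming a hypothetical tier classification per service. The tiers, weights, and threshold are illustrative only, not a prescribed scoring model.

```python
# Hypothetical service tiers: 1 = citizen-facing critical, 3 = internal, low criticality.
SERVICE_TIER = {
    "benefits-enrollment-portal": 1,
    "case-worker-dashboard":      1,
    "document-upload-service":    2,
    "internal-reporting-tool":    3,
}

TIER_WEIGHT = {1: 3.0, 2: 1.5, 3: 1.0}  # illustrative weights

def change_risk(base_infrastructure_risk: float, affected_services: list[str]) -> float:
    """Scale infrastructure-level risk by the most critical dependent service."""
    if not affected_services:
        return base_infrastructure_risk
    heaviest = max(TIER_WEIGHT[SERVICE_TIER[s]] for s in affected_services)
    return base_infrastructure_risk * heaviest

# The same medium infrastructure risk lands very differently depending on what depends on it:
print(change_risk(2.0, ["internal-reporting-tool"]))                              # 2.0 -> stays routine
print(change_risk(2.0, ["benefits-enrollment-portal", "case-worker-dashboard"]))  # 6.0 -> escalate to CAB
```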
Shared dependency detection. Some of the most consequential change-induced incidents happen because nobody realized that two applications shared a technical service dependency. CSDM makes shared dependencies explicit. When a change targets a technical service, the system identifies every application service that depends on it — revealing cascade potential that would otherwise only become visible at 10 p.m. on a Thursday. Overlapping changes to the same shared platform can be flagged and coordinated before they compound.
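A sketch of the same idea as a pre-approval check: given the technical service a change targets, list every dependent application service and flag other changes scheduled against the same shared platform in a nearby window. The change records and the four-hour collision window are invented for illustration.

```python
from datetime import datetime

# Hypothetical dependency and schedule data.
APP_SERVICES_ON = {
    "authentication-platform": ["benefits-enrollment-portal",
                                "case-worker-dashboard",
                                "document-upload-service"],
}

SCHEDULED_CHANGES = [  # (change id, targeted technical service, planned start)
    ("CHG0041", "authentication-platform", datetime(2024, 5, 16, 21, 0)),
    ("CHG0057", "authentication-platform", datetime(2024, 5, 16, 22, 0)),
]

def review(change_id: str, technical_service: str, start: datetime) -> None:
    dependents = APP_SERVICES_ON.get(technical_service, [])
    print(f"{change_id}: {len(dependents)} dependent application services -> {dependents}")

    # Flag other changes hitting the same shared platform within a few hours.
    for other_id, other_svc, other_start in SCHEDULED_CHANGES:
        if (other_id != change_id and other_svc == technical_service
                and abs((other_start - start).total_seconds()) < 4 * 3600):
            print(f"  WARNING: overlaps with {other_id} on the same shared platform")

review("CHG0041", "authentication-platform", datetime(2024, 5, 16, 21, 0))
```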
Historical pattern integration. Change management that has access to service context can also learn from service-related incident history. If the authentication platform has been associated with incidents following similar changes, that pattern is surfaced during the risk assessment — not buried in a post-incident report that nobody reads before the next change is approved.
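The lookup itself is simple once incidents are linked to services; the hard part is keeping those links current. A sketch with invented incident records, standing in for a query against incident and change history tied to the affected service:

```python
# Hypothetical incident history, linked to services rather than to individual CIs.
PAST_INCIDENTS = [
    {"service": "authentication-platform", "change_type": "configuration update",
     "months_ago": 11, "impact": "multi-service outage"},
    {"service": "reporting-warehouse", "change_type": "index rebuild",
     "months_ago": 3, "impact": "degraded reports"},
]

def prior_incidents(service: str, change_type: str) -> list[dict]:
    """Return past incidents on this service that followed a similar change."""
    return [i for i in PAST_INCIDENTS
            if i["service"] == service and i["change_type"] == change_type]

for inc in prior_incidents("authentication-platform", "configuration update"):
    print(f"Similar change caused '{inc['impact']}' {inc['months_ago']} months ago")
```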
"A change manager who can see the service blast radius before approving a change is a different kind of change manager. They're not reviewing paperwork. They're making an informed risk decision."
The Solution
The Prerequisite, and Why It's Not Optional
Everything described above depends on CSDM service relationships that are accurate and current. This is the part of the change management improvement story that doesn't appear in vendor demonstrations: the automated risk scoring is only as good as the service model feeding it. A risk assessment built on outdated or incomplete CMDB data produces confident results that are wrong in ways that are hard to detect until an incident proves it.
Two governance practices are non-negotiable for CSDM-enabled change management to work reliably.
Service ownership aligned to change review. The service owners whose records inform change risk assessments should be active participants in the change process — consulted when changes affect their services, responsible for validating that their service's dependency relationships are current. When a change manager sees that a technical service has three dependent application services, those application service owners should be visible, reachable, and expected to validate the assessment.
Post-incident feedback into the service model. Every change-induced incident is a data point. When an incident reveals a service dependency that the change risk assessment missed, that missing relationship should be added to the CMDB before the incident closes. Over time, this feedback loop creates a service model that gets more accurate with every incident — turning operational failures into governance improvements rather than isolated bad-luck stories.
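In its simplest form, that feedback step is a diff: compare the dependencies the incident actually exposed against what the service model recorded, and queue the gap as a model update before the incident is allowed to close. A small sketch with invented data:

```python
# What the model said depended on the platform vs. what the incident revealed.
modeled_dependents = {"benefits-enrollment-portal", "case-worker-dashboard"}
observed_dependents = {"benefits-enrollment-portal", "case-worker-dashboard",
                       "document-upload-service"}   # surfaced during the incident

missing = observed_dependents - modeled_dependents
if missing:
    print(f"Add to the service model before closing the incident: {sorted(missing)}")
```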
Summary
Back to Thursday Evening
The change that went to the authentication database cluster on Thursday evening was not a reckless decision. It was a reasonable decision made with incomplete information. The change manager didn't know about the three dependent application services or the two critical business capabilities because the system didn't tell them — because the service model wasn't connected to the change workflow in a way that surfaced that information automatically.
With CSDM relationships in place and connected to change management, that specific decision never gets made the same way. The change request surfaces three critical application service dependencies and two high-priority business capabilities. The automated risk score escalates to high. The CAB reviews it. Expanded testing is required. The authentication cluster is updated in a properly controlled window, with rollback plans confirmed and service owners notified in advance.
No 10 p.m. incident. No four-hour bridge call. No post-incident review that surfaces a dependency everyone should have already known about. Just a change that went through the right approval process because it had access to the right information.
Give change management the service context. It will do the rest.
