Get started with Service Reliability Management
Service Reliability Management (SRM) in the ServiceNow Xanadu release provides IT Operations and DevOps teams with a unified interface to monitor service health, manage service level objectives (SLOs), and streamline incident resolution. SRM helps teams maintain agility, performance, and uptime by placing incidents and service metrics in the context of service-level goals.
Key Features
- SRM Interface: A centralized home page and navigation experience offering quick access to critical alerts, assigned work, and team information tailored to your role and permissions.
- Setup Guide Modules: Step-by-step onboarding modules for setting up your homepage, services, and teams to ensure a smooth SRM adoption.
- Team and Service Management: Tools to create and administer teams responsible for services, define service parameters, and manage on-call schedules and escalation policies to guarantee timely incident response.
- Service Relationships: Visual map canvas to configure service dependencies, helping assess the impact of child services on parent services.
- Third-Party Integrations: Integration with external monitoring tools like Datadog and ServiceNow Cloud Observability to consolidate alerts within SRM.
- SLO, SLI, and Error Budget Management: Establish and monitor service performance goals and allowable failure times to meet contractual obligations.
- Alert Automations: (Requires Alert Automations app) Define alert conditions and rules for Application Performance Monitoring (APM) tools to automate notifications and incident creation.
Practical Use and Benefits
By adopting SRM, ServiceNow customers can quickly set up and manage their service teams and services within a single platform, ensuring that critical alerts and incidents are promptly addressed by the right teams. The integration capabilities enable centralized alert visibility, improving incident response efficiency. Setting SLOs and monitoring error budgets help maintain agreed-upon service reliability levels, supporting business continuity and customer satisfaction.
The SRM home page serves as the operational hub, providing relevant updates on services, schedules, maintenance, and team activities, enabling users to stay informed and act swiftly.
SRM accelerates your path to viewing service health in the context of service level objectives and incident resolution, and helps IT Operations and DevOps teams deliver on the promise of agility, performance, and uptime.
Get started with SRM by learning the different sections of the SRM interface.
For more information on roles, see SRM roles and responsibilities.
Basic SRM tasks
| Step | Description | Reference |
|---|---|---|
| Setup guide modules | The Setup guide modules for your Homepage, Services, and Teams landing pages show you how to add a team or a service. They include all the key milestones to make sure you're set up for success. | Add an SRM team |
| Your home page | Your home page is where you find the things most important to you, such as services with critical alerts and incidents, and work assigned to you and your team. | SRM home page |
| Learn the ins and outs of SRM navigation | Get familiar with the different sections and elements of the SRM interface. These sections and elements are referenced throughout the documentation. | SRM interface |
SRM helps you create and administer teams, services, and integrations.
| Step | Description | Reference |
|---|---|---|
| Manage an application or technical service in SRM | Define the basic tasks and parameters that make up your service and how it should behave. | Add a service to SRM |
| Set up an SRM team | Set up a team. Teams are responsible for the issues that occur in their associated services. | Add an SRM team |
| Set up on-call schedule and escalation policies | Create an on-call schedule for your team to ensure that dedicated support team members are available to resolve issues as they arise. You can also set up an escalation policy for your team so that at least one team member is engaged in incident response. | Create your SRM On-call schedule |
| Configure service relationships | Use a map canvas to add, configure, and arrange services. You can add child services that depend on parent services. | For more information, see View impact of child service on parent service. |
| Integrate services with third-party monitoring tools | Set up a third-party integration, such as Datadog or ServiceNow Cloud Observability, with SRM so that alerts are available to your teams within SRM. | Add an integration to SRM |
| Establish SLOs, SLIs, and error budgets for services | Establish goals for how well your service should operate. Also specify the maximum amount of time that a technical system can fail without contractual consequences. | For more information, see Service Level Objective Management. |
| Set up alert automations. Note: This functionality is only available if you have installed the Alert automations application. | Alert automations enable you to define alert conditions. Set up alert rules for each APM tool to define the conditions under which the APM tool should send notifications to SRM. | The Alert automations application is available from the ServiceNow Store. |
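
The error budget mentioned for SLOs (the maximum amount of time a system can fail without contractual consequences) comes down to simple arithmetic. The following Python sketch is illustrative only: it is not SRM code and does not call any ServiceNow API. The function names `error_budget` and `budget_remaining`, the 99.9% target, and the 30-day window are assumptions chosen for the example, and it assumes an availability-style SLO where the budget is the share of the window in which the service may be unavailable.

```python
from datetime import timedelta


def error_budget(slo_target: float, window: timedelta) -> timedelta:
    """Return the allowed downtime for an availability SLO over a window.

    slo_target is a fraction, for example 0.999 for a 99.9% objective.
    (Hypothetical helper for illustration; not part of SRM.)
    """
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in the range (0, 1]")
    # The budget is the fraction of the window that may be spent failing.
    return window * (1.0 - slo_target)


def budget_remaining(slo_target: float, window: timedelta,
                     downtime: timedelta) -> timedelta:
    """Return the unused error budget; a negative value means the SLO is breached."""
    return error_budget(slo_target, window) - downtime


# Example values (assumed): a 99.9% objective measured over 30 days,
# with 10 minutes of downtime already recorded in that window.
window = timedelta(days=30)
budget = error_budget(0.999, window)
remaining = budget_remaining(0.999, window, timedelta(minutes=10))
print("Error budget:", budget)
print("Remaining:", remaining)
```

With these assumed values, the budget works out to roughly 43 minutes of allowable downtime per 30 days, so 10 minutes of recorded downtime leaves about 33 minutes before the objective is missed.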