Working with SRM services

  • Release version: Australia
  • Updated: March 12, 2026
  • Reading time: 5 minutes
  • A service represents a functional outcome, such as networking, payments, or HR services, that is owned by a team. To deliver that outcome, a service can contain one or more technical components, such as a user authentication service or a piece of shared infrastructure like a database.

    Service Reliability Management (SRM) works with integrations to prioritize and route alerts to the relevant responders. It follows up with escalations until the alert is acknowledged and someone responds. When you create or add a service in SRM, it must reflect a service in your SRM infrastructure.
    Note:
    You might want multiple tool integrations to monitor each technology management service and receive events from those tools. See Working with integrations in SRM for more information.

    In addition, you can create reliability metrics for the service. See Working with reliability metrics.

    Tying a team and policies to that service makes it easier to divide responsibilities and track technical outcomes. It also makes it easier to automate response routines and focus on who you notify and when.

    The state of an existing service is inherited. The state of a service created in SRM is None.

    Services Overview

    Figure 1. Information on the Overview tab
    Services page showing the list of your services

    The cards on the Overview tab display the following metrics. By default, the list view shows information related to the Your services card. Select a different card to view different information in the list view.

    • Your services: Count of all the services you or your team manages and monitors for reliability.
    • Services with active incidents: Services with open incidents sorted in the following order:
      • Business criticality - most critical first.
      • Number of active incidents - highest first.
      • Percentage of error budget remaining - lowest first.
    • Services with critical alerts: Services with open alerts sorted in the following order:
      • Business criticality - most critical first.
      • Number of alerts - highest first.
      • Percentage of error budget remaining - lowest first.
    • Services with open changes: All the services your team manages and monitors.
    • Services with low error budget: Services with less than 25% error budget remaining.

      The error budget metric is represented as the amount of service level objective (SLO) that you can spend over a specified time. It can be used to manage release velocity.
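
    The card ordering and the low error budget threshold described above can be sketched in a few lines of Python. This is only an illustration of the sort order, not SRM's implementation: the Service class, its field names, and the convention that a lower criticality number means more critical are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Service:
    """Hypothetical record for one service; field names are illustrative."""
    name: str
    criticality: int               # assumption: 1 = most critical
    active_incidents: int
    error_budget_remaining: float  # percentage, 0-100


def services_with_active_incidents(services: List[Service]) -> List[Service]:
    """Keep services with open incidents, ordered as the card describes."""
    with_incidents = [s for s in services if s.active_incidents > 0]
    return sorted(
        with_incidents,
        key=lambda s: (
            s.criticality,             # most critical first
            -s.active_incidents,       # highest incident count first
            s.error_budget_remaining,  # lowest remaining budget first
        ),
    )


def services_with_low_error_budget(
    services: List[Service], threshold: float = 25.0
) -> List[Service]:
    """Keep services with less than `threshold` percent error budget left."""
    return [s for s in services if s.error_budget_remaining < threshold]


# Made-up sample data for illustration only.
sample = [
    Service("payments", 1, 2, 40.0),
    Service("hr-portal", 3, 5, 10.0),
    Service("networking", 1, 4, 60.0),
    Service("auth", 1, 4, 20.0),
    Service("wiki", 4, 0, 90.0),
]

for service in services_with_active_incidents(sample):
    print(service.name, service.active_incidents, service.error_budget_remaining)
```

    The same composite key would apply to the Services with critical alerts card, with the alert count in place of the incident count.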

    Note:
    To refresh the card and list values, select Refresh.
    You can interact with the list in the following ways:

    For more information about individual service details, see Edit service details form.

    Services list view definitions

    The columns include the following details:
    • Service: Name of the service.
    • Class: Service instance or technology management service.
    • Business criticality: Importance of the service to the business.
    • Open alerts: Number of open alerts assigned to the service.
    • Open incidents: Number of open incidents assigned to the service.
    • Error budget remaining: Percentage of error budget remaining for the service.
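
    As a rough illustration of what an Error budget remaining percentage can mean, the following sketch uses a common event-based formulation: the budget is the share of events allowed to fail under the SLO target, and the remaining percentage is how much of that allowance is still unspent. The function name, its parameters, and the formula are assumptions for this example; SRM may calculate the value differently.

```python
def error_budget_remaining_pct(
    slo_target: float, good_events: int, total_events: int
) -> float:
    """Return the percentage of error budget left (hypothetical helper).

    slo_target is a fraction such as 0.99 for a 99% SLO.
    A negative result means the budget has been overspent.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if good_events < 0 or good_events > total_events:
        raise ValueError("good_events must be between 0 and total_events")
    if total_events == 0:
        return 100.0  # nothing measured yet, so nothing has been spent

    allowed_bad = (1.0 - slo_target) * total_events  # failures the SLO permits
    actual_bad = total_events - good_events          # failures observed
    return 100.0 * (1.0 - actual_bad / allowed_bad)


# With a 99% SLO over 10,000 events, 100 failures are allowed.
# 50 observed failures leave about half of the budget.
print(round(error_budget_remaining_pct(0.99, 9_950, 10_000), 2))
```

    Under this formulation, a service falls under the 25% low error budget threshold once it has used more than three quarters of its allowed failures.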

    Service reliability

    The Service reliability tab is a customizable dashboard showing high-level service performance. For more information about the dashboard, see Visualizations in the Service reliability dashboard.