View SRM Reliability metrics

  • Release version: Xanadu
  • Updated August 1, 2024
  • 3 minutes to read
  View a service level objective (SLO) and service level indicator (SLI) that you or your team owns.

    Before you begin

    Role required: Responder, Manager, or Administrator
    Note:
    Administrators can view any SRM SLO.

    Procedure

    1. Navigate to Workspaces > Service Operations Workspace.
      You are taken to your SRM homepage.
      Note:
      If you have other SOW applications, then depending on your assigned roles, the homepage might be the SOW homepage rather than the SRM homepage, with SRM alerts and incidents included in your metrics. In that case, to view SRM-specific areas, select SRM modules from the left navigation pane.
    2. From the left navigation pane, select the Services icon.
    3. Select the Reliability metrics tab.
    4. Open a Service level objective.
    5. View the SLO header.
      The top header contains service information for:
      • SLI type:
        Type of the SLI based on which the metrics are calculated. The available types of SLI are as follows:
        • Availability: Percentage of time your service is available. (Default)
        • Errors: Measurement of how frequently service errors occur.
        • Latency: Time taken to service a request. The actual amount of time that elapsed.
        • Saturation: Measurement of how much of your system's capacity is in use, emphasizing the resources that are most constrained.
      • SLO type:
        Type of SLO based on which metric you choose. The available types of SLO are as follows:
        • Duration: The amount of time the service spends without breaching. It’s the only value available.
          Metrics for Duration are as follows:
          • Objective (percentage): Percentage of the desired SLI performance.
          • Error budget: Displays, in days and time, how much error budget is left.
        • Count: Number of occurrences in a given compliance period.
          Metrics for Count are as follows:
          • Limit (occurrences): Number of occurrences after which a breach has occurred.
          • Remaining breach occurrences: Number of occurrences left.
      • State:
        State of the SLO. Choices are:
        • Draft: The SLO isn't running in your instance yet. You can add new SLIs, update existing SLIs, or delete the SLO.
        • Running: The SLO is active in your instance. You can edit, retire, or delete the SLO.
          Note:
          Editing an SLO in the Running state retires it and creates a new copy.
        • Retired: The SLO is no longer running in your instance. You can reactivate it.
      • Service: Service associated with the SLO.
      • Reliability: How reliable the service is.
        • Stable: All SLOs in this service have more than 25% of the error budget still available.
        • At risk: All SLOs in this service have error budget left, and at least one SLO in this service has less than 25% of the error budget left.
        • Critical: At least one SLO in this service has used up its error budget.
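The Reliability values above amount to a threshold rule over the remaining error budget of each SLO in a service. The following is a minimal sketch in Python, assuming each SLO's remaining budget is given as a fraction between 0 and 1. The function name and input shape are hypothetical and aren't part of the SRM product. The list above doesn't say how a budget of exactly 25% is classified, so the sketch assumes it counts as At risk.

```python
from typing import Iterable

AT_RISK_THRESHOLD = 0.25  # 25% of the error budget, per the Reliability list


def classify_reliability(remaining_fractions: Iterable[float]) -> str:
    """Classify a service from the remaining error budget of its SLOs.

    Each value is the fraction of error budget left (1.0 = untouched,
    0.0 or below = fully used). Hypothetical helper, for illustration only.
    """
    fractions = list(remaining_fractions)
    if not fractions:
        raise ValueError("a service needs at least one SLO to classify")
    # Critical: at least one SLO has used up its whole error budget.
    if any(f <= 0 for f in fractions):
        return "Critical"
    # Stable: every SLO has more than 25% of its budget left.
    if all(f > AT_RISK_THRESHOLD for f in fractions):
        return "Stable"
    # At risk: budget remains everywhere, but some SLO is at or below 25%.
    # Treating exactly 25% as At risk is an assumption of this sketch.
    return "At risk"
```

For example, a service whose SLOs have 90% and 10% of their budgets left would be classified as At risk, because budget remains on both but one has dropped below the 25% threshold.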

      From the header you can Delete SLO, Retire, or Edit.

      If you delete an SLO, any associated SLIs are deleted as well.

      If you retire an SLO, its state changes to Retired. You can reactivate it from the SLO page or later from the Reliability metrics tab.

      Note:
      Editing an SLO changes its state: the existing SLO record is retired and a new copy opens for editing. See Create SLO, SLI, and Error budget policies for more information. You can reactivate the original SLO by returning to the Reliability metrics tab.
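The states and header actions described in this step can be read as a small lookup table. The sketch below is illustrative only: the action names are hypothetical labels, not SRM API calls, and only the actions named in this topic are modeled. That reactivating a Retired SLO returns it to Running is an assumption, because this topic doesn't state the resulting state.

```python
from typing import Optional

# Actions this topic names for each state; anything not listed is
# treated as unavailable in this sketch.
ALLOWED_ACTIONS = {
    "Draft": {"update_slis", "delete"},
    "Running": {"edit", "retire", "delete"},
    "Retired": {"reactivate"},
}

# State of the original record after each action (None = record deleted).
RESULTING_STATE = {
    "update_slis": "Draft",
    "edit": "Retired",        # editing retires the original; a new copy opens
    "retire": "Retired",
    "reactivate": "Running",  # assumption: reactivation returns to Running
    "delete": None,
}


def apply_action(state: str, action: str) -> Optional[str]:
    """Return the original record's state after the action, or None if deleted."""
    if action not in ALLOWED_ACTIONS.get(state, set()):
        raise ValueError(f"{action!r} is not available in state {state!r}")
    return RESULTING_STATE[action]
```

Under these assumptions, `apply_action("Running", "edit")` returns `"Retired"` for the original record, which matches the note above about editing a running SLO.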
    6. View the Overview tab and select a Historic period from the list menu.
      The periods listed are from the date that the SLO was created to the present in the selected increments.
    7. View Summary metric cards for the SLO.
      The metrics show aggregate or average values, depending on the SLI types and error budget policy thresholds chosen. See Create SLO, SLI, and Error budget policies for more detailed information.
    8. View the Service level indicators section.

      This section lists your indicators by name.

    9. View the Service level objective (SLO) history section.
      Depending on your SLO type, this section displays the following in bar graph reports:
      For Duration:
      • Error budget used
      • Error budget remaining
      • Burn rate
      For Count, Count by periods, or Count by occurrences:
      • Cumulative breach occurrences
      • Burn rate
      • Alerts, incidents & changes impacting this service

      Selecting a point in one of the analytics line chart reports shows the values for that day.
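To make the history metrics more concrete, here is a rough sketch of the arithmetic behind them. It is not the product's actual calculation. It assumes that a Duration SLO's error budget is the share of the compliance period left over by the objective percentage, and that burn rate follows the common SRE definition (fraction of budget used divided by fraction of the period elapsed). All function and parameter names are hypothetical.

```python
from datetime import timedelta
from typing import Tuple


def duration_budget(objective_pct: float, period: timedelta) -> timedelta:
    """Total error budget: the share of the period that may breach."""
    return period * ((100 - objective_pct) / 100)


def duration_usage(
    objective_pct: float,
    period: timedelta,
    breached: timedelta,
    elapsed: timedelta,
) -> Tuple[float, timedelta, float]:
    """Return (fraction of budget used, budget remaining, burn rate)."""
    budget = duration_budget(objective_pct, period)
    if budget <= timedelta(0) or elapsed <= timedelta(0):
        raise ValueError("budget and elapsed time must be positive")
    used = breached / budget  # timedelta / timedelta gives a float
    remaining = budget - breached
    # Assumed definition: 1.0 means the budget is being spent at exactly
    # the pace that would exhaust it at the end of the period.
    burn_rate = used / (elapsed / period)
    return used, remaining, burn_rate


def remaining_breach_occurrences(limit: int, occurrences: int) -> int:
    """Count SLO: occurrences left before the limit is reached.

    Clamping at zero is an assumption of this sketch.
    """
    return max(limit - occurrences, 0)
```

Under these assumptions, a 99.9% objective over a 30-day period leaves an error budget of about 43.2 minutes. If 21.6 minutes have been breached 15 days in, half the budget is used and the burn rate is about 1.0.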

    10. View the Details tab.
      The fields in this tab are populated automatically and can't be edited. See Create SLO, SLI, and Error budget policies for detailed information on the fields.
    11. View the Service level indicators tab.
      This tab lists the SLIs associated with this SLO.
    12. View the Error budget policy tab.
      In this tab, you can add more thresholds and edit or delete existing ones.