An overview of alerts for Event Management operators

  • Release version: Australia
  • Updated: March 12, 2026
  • Time required: 8 minutes
  • As an Event Management operator, you need to understand how an alert is generated from an event, what to look for in an alert, and how alerts can be grouped together.

    This is the first lesson in the Event Management tutorial.

    Lesson 1: An overview of events and alerts
    Lesson 2: An overview of application services
    Lesson 3: Event Management operator workspaces
    Lesson 4: What operators do

    Your organization already has one or more event monitoring tools in place, such as Microsoft System Center Operations Manager (SCOM), Nagios, or SolarWinds. When an issue occurs on your network, such as a computer going down or a database failure, the event monitoring tools send events to your ServiceNow instance. The Event Management application processes the events according to the settings that your administrator configured, and then generates alerts. An alert is an indicator that the issue requires some type of action.

    Figure 1. Alert generation
    An operator view of Event Management
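
    The flow from events to alerts can be sketched in a few lines of Python. This is only an illustration, not how Event Management is implemented: the `Event` and `Alert` classes, the field names, and the rule that events sharing the same source and resource update a single alert are all assumptions made for the example. In a real instance, the processing depends on the settings your administrator configured.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str       # monitoring tool that sent the event (hypothetical field)
    resource: str     # affected resource (hypothetical field)
    severity: str
    description: str


@dataclass
class Alert:
    source: str
    resource: str
    severity: str
    description: str
    event_count: int = 1  # how many events fed this alert


def process_events(events):
    """Turn events into alerts.

    Assumed rule for this sketch: one alert per (source, resource) pair.
    A later event for the same pair updates the existing alert.
    """
    alerts = {}
    for event in events:
        key = (event.source, event.resource)
        if key in alerts:
            alert = alerts[key]
            alert.severity = event.severity        # latest severity carries over
            alert.description = event.description
            alert.event_count += 1
        else:
            alerts[key] = Alert(
                event.source, event.resource, event.severity, event.description
            )
    return list(alerts.values())


events = [
    Event("Nagios", "db-server-01", "Major", "Database response time degraded"),
    Event("Nagios", "db-server-01", "Critical", "Database failure"),
    Event("SolarWinds", "router-07", "Critical", "Router is down"),
]
alerts = process_events(events)
for alert in alerts:
    print(alert.resource, alert.severity, alert.event_count)
```

    In this sketch, three events produce two alerts, because both Nagios events concern the same resource.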

    As an Event Management operator, your role is to view alerts and, depending on how Event Management is implemented in your organization, take an action to help resolve the underlying issue or notify someone who can. Later in this tutorial, you will see the phases of a typical alert management process.

    Alert priority and severity

    The two most common characteristics of an alert are the priority and the severity.
    • The priority of an alert is a score that helps you determine how significant the alert's impact on application services is. Multiple factors determine the alert priority score. Your Event Management administrator can configure the algorithm that the Event Management application uses to calculate priority.
    • The severity of an alert is an indicator of how serious the underlying issue is. The event monitoring tool in your organization usually sends a severity value with the event, and that value is then carried over to the alert. These are the default severity types that you will see in this tutorial:
      Severity    Description
      Critical    The resource is either not functional or critical problems are imminent.
      Major       Major functionality is severely impaired or performance has degraded.
      Minor       Partial, non-critical loss of functionality or performance degradation occurred.
      Warning     Attention is required, even though the resource is still functional.
      OK          No severity. An alert is created. The resource is still functional.
      Clear       The alert no longer needs action.
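
    As a rough sketch of how an operator might use these severity types to decide what to look at first, the following Python example ranks alerts from most to least serious. The numeric ranks, the dictionary fields, and the alert numbers are assumptions chosen for illustration; they are not the values the product stores.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Lower rank = more serious. The ranks are illustrative only.
    CRITICAL = 1
    MAJOR = 2
    MINOR = 3
    WARNING = 4
    OK = 5
    CLEAR = 6


def triage_order(alerts):
    """Return alerts sorted from most to least serious severity."""
    return sorted(alerts, key=lambda a: Severity[a["severity"].upper()])


def needs_action(alert):
    """A Clear alert no longer needs action; every other severity is kept."""
    return Severity[alert["severity"].upper()] is not Severity.CLEAR


alerts = [
    {"number": "Alert0001", "severity": "Warning"},
    {"number": "Alert0002", "severity": "Critical"},
    {"number": "Alert0003", "severity": "Clear"},
    {"number": "Alert0004", "severity": "Major"},
]

ordered = [a["number"] for a in triage_order(alerts)]
actionable = [a["number"] for a in alerts if needs_action(a)]
print(ordered)
print(actionable)
```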

    Correlated alerts

    Some alerts are related to each other. For example, if a router goes down, several separate alerts could be generated, one for each server connected to the router. All of these alerts are related, or correlated. To help you manage correlated alerts, Event Management can automatically group them and establish a two-level hierarchy with one root alert, called the primary alert, at the top, and other related alerts, called secondary alerts, under the primary alert. When you view alerts, primary alerts stand out by default so you know which alert to focus on without being distracted by the secondary alerts.

    In our example, if a router goes down on your network, network communication is also affected for the connected servers, assuming they cannot reach any other routers. The router outage becomes the primary alert, and the alerts generated for the servers are secondary alerts that are correlated under the router alert.

    Figure 2. Secondary alert generation
    Correlated alerts
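
    The router example can be sketched as a small grouping routine. This is a simplified illustration under stated assumptions, not the correlation logic of the product: the `upstream` map (which resource each resource depends on), the alert fields, and the rule "the topmost alerting resource becomes the primary alert" are hypothetical, and the dependency map is assumed to contain no cycles.

```python
def correlate(alerts, upstream):
    """Group alerts into a two-level hierarchy of primary and secondary alerts.

    alerts:   list of dicts with 'number' and 'resource' keys
    upstream: dict mapping a resource to the resource it depends on
    """
    alerting = {a["resource"] for a in alerts}

    def root_of(resource):
        # Walk up the dependency chain while the parent is also alerting.
        current = resource
        while upstream.get(current) in alerting:
            current = upstream[current]
        return current

    groups = {}
    for alert in alerts:
        root = root_of(alert["resource"])
        group = groups.setdefault(root, {"primary": None, "secondary": []})
        if alert["resource"] == root and group["primary"] is None:
            group["primary"] = alert["number"]
        else:
            group["secondary"].append(alert["number"])
    return groups


upstream = {"web-01": "router-07", "web-02": "router-07", "db-09": "router-08"}
alerts = [
    {"number": "Alert0002", "resource": "web-01"},
    {"number": "Alert0001", "resource": "router-07"},
    {"number": "Alert0003", "resource": "web-02"},
    {"number": "Alert0004", "resource": "db-09"},
]

groups = correlate(alerts, upstream)
for root, group in groups.items():
    print(root, group)
```

    Here the router alert becomes the primary alert with the two server alerts under it, while the alert for `db-09` stays on its own because its router has no alert.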

    Depending on your organization's Event Management implementation, alerts might be grouped automatically based on correlation rules that your administrator sets up. Your instance can also learn how to improve the way it correlates alerts based on these rules. As an operator, you should still verify the accuracy of the correlation and, if necessary, manually correlate additional alerts with the primary alert. Later in this tutorial, you will learn how to manually correlate alerts.

    Alert flapping

    An alert can flap, meaning that it receives multiple open and close events in rapid succession. Flapping indicates that Event Management cannot determine whether the underlying events are genuine. The events could indicate small issues with the way CIs are configured, or larger issues, such as network outages.

    Figure 3. Alert flapping
    CPU usage

    For example, if a server that hosts a web service has too many active processes, it might trigger an event about excessive CPU usage. Since CPU usage can fluctuate rapidly depending on web service requests, several events might be triggered, leading to the alert being put in the flapping state. As an operator, you might need to create an incident to have the server restarted, or someone might have to reconfigure the CPU, or possibly make a hardware change on the device.

    As another example, consider a loose network cable that causes momentary, repeated network outages. The thresholds that your administrator configured might not be optimal for this kind of alert, so Event Management considers it a flapping alert.
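
    One way to picture flapping is as a count of open and close transitions inside a short time window. The sketch below follows that idea; the `window` and `min_changes` thresholds, the function name, and the data shape are hypothetical and stand in for whatever thresholds your administrator configures.

```python
from datetime import datetime, timedelta


def is_flapping(state_changes, window=timedelta(minutes=10), min_changes=4):
    """Return True if the alert changed state at least `min_changes` times
    within any span of length `window`.

    state_changes: chronologically sorted list of (timestamp, state) tuples,
                   where state is "open" or "closed".
    """
    # Keep only real transitions; repeated identical states do not count.
    transitions = []
    previous = None
    for timestamp, state in state_changes:
        if state != previous:
            transitions.append(timestamp)
            previous = state

    # Slide a window over the transition times.
    start = 0
    for end in range(len(transitions)):
        while transitions[end] - transitions[start] > window:
            start += 1
        if end - start + 1 >= min_changes:
            return True
    return False


base = datetime(2026, 3, 12, 9, 0)

# CPU usage bouncing around a threshold: the state changes every minute.
cpu_alert = [
    (base + timedelta(minutes=i), "open" if i % 2 == 0 else "closed")
    for i in range(6)
]

# A single outage that was resolved two hours later.
stable_alert = [(base, "open"), (base + timedelta(hours=2), "closed")]

print(is_flapping(cpu_alert))
print(is_flapping(stable_alert))
```

    Widening the window or lowering the minimum number of changes makes the check more sensitive, which mirrors the trade-off described above when thresholds are not well matched to a particular kind of alert.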

    Continue the tutorial

    Proceed to the next lesson: Application services for Event Management operators.