benny_makovsky
ServiceNow Employee

In this blog, we explore how GenAI is transforming the AIOps landscape. We delve into the integration of GenAI with Health Log Analytics (HLA), showcasing how it enhances alert analysis and operational efficiency. Discover the multiple layers of GenAI's impact, from detecting problems to automating their resolution.

 

Introduction to GenAI in AIOps

GenAI is revolutionizing the tech landscape with its sophisticated capabilities. Its applications are incredibly practical and can significantly enhance our productivity in ways previously unimaginable. Specifically, in the realm of AIOps, this transformative technology is taking productivity to new heights.

 

Health Log Analytics: The Foundation

Our focus today is on a breakthrough solution within AIOps – Health Log Analytics. This innovative product automatically analyzes anomalous behavior in logs, identifying unusual patterns through machine learning. By proactively detecting these anomalies, Health Log Analytics ensures early issue resolution, making it a game changer for operational efficiency.
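To make the idea of "unusual patterns" concrete, here is a minimal sketch of one simple way to flag an anomaly in log volume. It is not the algorithm HLA uses (that is covered in the separate blog); it is a basic z-score check against a trailing baseline, and the function name, threshold, and sample counts are all assumptions made up for illustration.

```python
from statistics import mean, stdev


def find_anomalous_intervals(counts, threshold=3.0, baseline_size=5):
    """Return indexes whose count deviates from the trailing baseline
    by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(baseline_size, len(counts)):
        baseline = counts[i - baseline_size:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any change at all counts as unusual.
            if counts[i] != mu:
                anomalies.append(i)
            continue
        if abs(counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical per-minute counts of one log pattern (illustrative data only).
error_counts = [2, 3, 2, 4, 3, 2, 3, 40, 3, 2]
print(find_anomalous_intervals(error_counts))  # prints [7]
```

The spike at index 7 stands out against the preceding five minutes, while the quiet minutes after it do not, because the spike itself widens the baseline.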

 

Adding a Layer of Insight with GenAI

While Health Log Analytics (HLA) is based on existing algorithms, detailed in a separate blog, the integration of GenAI adds an extra layer of insight. One of our latest features, Alert Analysis, leverages GenAI to provide a deeper understanding of alerts generated by the system, including alerts that do not originate from HLA.

 

Understanding Alerts with GenAI

Alert Analysis examines the alert context and translates complex issues into simple English summaries. It offers a concise explanation of the problem, recommendations for resolution, and can even identify possible root causes. This capability covers both log-based alerts and regular alerts, including those raised from events and metrics, ensuring comprehensive monitoring and management.

 

For instance, here is a before-and-after comparison of the same group of alerts, demonstrating the effectiveness of Alert Analysis:

 

  1. Before Analysis:

In the first image, we see a group of alerts identified in the logs. These alerts contain technical error messages and codes, making them complex and difficult to understand at a glance. For example, the errors include terms like "ExtHandler Error" and "ProtocolError," which are not easily interpretable by those without specific domain knowledge. This raw data is overwhelming and requires significant expertise to interpret.

 

[Image: Service Operations Workspace screenshot of the alert group before analysis]

  2. After Analysis:

In the second image, the Alert Analysis feature has processed the same group of alerts using Now Assist. It provides a clear summary and analysis of the issues, translating the technical jargon into plain English. The summary explains that the alerts are related to "ACC Linux Server: ProtocolError and ResourceGoneError," and suggests that these errors might be due to a misconfiguration or unexpected behavior in the Linux Server Component. This makes the information accessible even to non-experts and helps quickly identify the root cause and potential solutions.

 

[Image: Service Operations Workspace screenshot of the same alert group after Alert Analysis]

This comparison illustrates how GenAI simplifies the understanding of complex alerts, allowing for quicker and more efficient issue resolution.

 

Significance of Alert Analysis in HLA

In the context of HLA, the advantage of Alert Analysis becomes even more significant. Alerts generated from logs are often highly detailed and specific to a particular application. That makes them hard to understand for anyone who is not a domain expert or system specialist: the alerts might only make sense to developers, which poses a challenge for operators.

 

Moreover, HLA's capability to group multiple anomalous alerts together is further enhanced by GenAI. It can analyze several simultaneous anomalies in the logs and provide a comprehensive explanation in simple and intuitive English. This helps in understanding the nature and implications of specific issues or a collection of issues exposed in the logs. With clear and straightforward language, even non-experts can grasp the problem quickly.
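As a rough illustration of the grouping idea, the sketch below bundles alerts that arrive close together in time and turns each group into a single plain-text request that could be handed to a language model. Everything here is hypothetical: the `Alert` fields, the five-minute window, and the prompt wording are assumptions for the example, not how HLA or Now Assist actually group or summarize alerts.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since epoch
    source: str       # e.g. the component that raised the alert
    message: str      # raw, technical alert text


def group_by_time(alerts, window_seconds=300):
    """Group alerts whose timestamps fall within `window_seconds`
    of the previous alert in the same group."""
    groups = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if groups and alert.timestamp - groups[-1][-1].timestamp <= window_seconds:
            groups[-1].append(alert)
        else:
            groups.append([alert])
    return groups


def build_summary_request(group):
    """Build one plain-text request covering every alert in the group."""
    lines = [f"- [{a.source}] {a.message}" for a in group]
    return (
        "Summarize the following related alerts in plain English, "
        "then suggest a likely root cause and next steps:\n" + "\n".join(lines)
    )


# Illustrative alerts only; the messages echo the error names quoted earlier.
alerts = [
    Alert(1000.0, "ACC Linux Server", "ExtHandler Error"),
    Alert(1060.0, "ACC Linux Server", "ProtocolError"),
    Alert(9000.0, "ACC Linux Server", "ResourceGoneError"),
]
for group in group_by_time(alerts):
    print(build_summary_request(group))
    print()
```

The point of grouping first is that the model sees the anomalies together, so the explanation can describe one underlying issue instead of several disconnected ones.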

 

Proactive Problem Resolution with GenAI

The logical next step in integrating GenAI within AIOps is leveraging it for deeper analysis and proactive support. Once we've explained the issues in simple terms and identified correlations between logs, the next phase involves using GenAI to gather additional supportive information. This includes fetching more logs, metrics, and potentially related alerts.

 

Furthermore, GenAI can utilize its broad contextual understanding to provide supportive information such as related incidents or changes, and even external sources like articles or documentation. The ultimate goal is to automate the remediation process based on this comprehensive data. Initially, this might involve quick fixes, and over time, it can evolve into creating automated workflows that address issues thoroughly.

 

Conclusion: The Layers of GenAI in AIOps

Thus, AI in AIOps works in multiple layers: first, standard machine learning detects problems; next, GenAI explains those problems in plain language; and finally, the resolution process is automated. Each of these layers enhances our ability to be proactive, reducing the need to wait for problems to escalate before addressing them.
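The layering can be pictured as three pluggable steps chained together. The sketch below is only a conceptual outline: `run_layers` and the three stand-in functions are hypothetical names invented for this example, and in a real deployment each step would be backed by the platform's own detection, GenAI, and workflow capabilities.

```python
def run_layers(log_lines, detect, explain, remediate):
    """Chain the three layers: detect issues, explain each one,
    then decide on a remediation action."""
    report = []
    for issue in detect(log_lines):
        explanation = explain(issue)
        action = remediate(issue, explanation)
        report.append(
            {"issue": issue, "explanation": explanation, "action": action}
        )
    return report


# Trivial stand-ins for each layer, for illustration only.
def detect_errors(lines):
    return [line for line in lines if "Error" in line]


def explain_issue(issue):
    return f"The log reported a problem: {issue}"


def suggest_fix(issue, explanation):
    return "open a ticket for review"


report = run_layers(
    ["service started", "ProtocolError on agent"],
    detect_errors,
    explain_issue,
    suggest_fix,
)
for entry in report:
    print(entry)
```

Keeping the layers separate means the remediation step can start as a simple quick fix and later be swapped for a fuller automated workflow without touching detection or explanation.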

 

 
