An anomaly is any change within a data pattern, often appearing as a statistical outlier or an abnormal sequence. Anomaly detection is the practice of identifying these outliers: data points and observations that deviate significantly from what is typical and expected. These anomalies, which are rare and do not conform to usual trends, can indicate issues like fraud or system malfunctions.
Of course, it’s important to differentiate an anomaly from a novelty. An anomaly is a deviation from established norms, often signaling a problem. In contrast, novelty detection involves identifying patterns that were not observed during the model’s training phase, focusing on new data points that could indicate emerging trends.
Both anomaly and novelty detection are crucial for monitoring and responding to data, as they provide insight into either potential disruptions or new developments. These methods are especially valuable in cybersecurity, where they can preemptively address deviations from expected patterns.
Anomaly detection systems identify three types of anomalies, each crucial for understanding a different kind of deviation from expected patterns. Each type has distinct characteristics and requires different analytical strategies and tools to identify and manage.
Point/global anomalies
These refer to a single data point that sits far outside the rest of the data. Global anomalies are the easiest to identify because they differ significantly from the entire dataset.
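As a minimal, hypothetical illustration (assuming Python with NumPy; the readings and threshold are made up), a simple z-score check flags any point that sits several standard deviations away from the rest of the data:

```python
import numpy as np

def zscore_outliers(values, threshold=3.0):
    """Return indices of points more than `threshold` standard deviations from the mean."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return np.where(np.abs(z) > threshold)[0]

# One reading sits far from the rest of the distribution and is flagged.
readings = [10.1, 9.8, 10.3, 10.0, 55.0, 9.9, 10.2]
print(zscore_outliers(readings, threshold=2.0))  # -> [4]
```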
Contextual anomalies
Also known as conditional outliers, these are data points that are abnormal in one context but normal in another. Contextual anomalies are especially common in time series data, where the context is defined by time.
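A small sketch of the time-based case (hypothetical pandas code with made-up request counts): the same value is scored against the baseline for its own hour of day, so a count that is normal at midday is flagged at 3 a.m.:

```python
import pandas as pd

# Hypothetical history of request counts, keyed by hour of day (the "context").
history = pd.DataFrame({
    "hour":     [3, 3, 3, 3, 12, 12, 12, 12],
    "requests": [40, 35, 45, 38, 380, 420, 410, 395],
})
baseline = history.groupby("hour")["requests"].agg(["mean", "std"])

def is_contextual_anomaly(hour, value, k=3.0):
    """Score a new observation against the baseline for its own context."""
    mean, std = baseline.loc[hour, "mean"], baseline.loc[hour, "std"]
    return abs(value - mean) > k * std

print(is_contextual_anomaly(12, 400))  # False: typical midday traffic
print(is_contextual_anomaly(3, 400))   # True: the same value is anomalous at 3 a.m.
```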
Collective anomalies
These anomalies occur when an entire subset of data is anomalous compared to a wider set of data—individual data points are not a consideration when identifying collective anomalies. This type is often observed when analyzing complex patterns across different datasets.
Anomaly detection can be traced back to the early 19th century when statisticians first grappled with outliers in their data. Initially, these outliers were simply considered errors or extra noise, and the methods to deal with them were rudimentary—mainly relying on basic statistical tools to flag deviations.
The field advanced significantly during the mid-20th century with the advent of computers. As technology progressed, it became possible to process larger datasets and apply more complex algorithms for identifying anomalies. The rise of machine learning (ML) has further transformed anomaly detection, introducing sophisticated methods like neural networks, support vector machines, and clustering. These innovations have automated and refined the process, making anomaly detection a critical tool across industries such as finance and healthcare.
Anomaly detection is crucial across various industries, such as finance, retail, and cybersecurity, but its importance extends to virtually all businesses. This technology automates the identification of outliers that could be harmful, thereby safeguarding valuable data. Since it plays an important role in identifying irregular patterns, anomaly detection can help protect the data of both organizations and customers.
Data is often described as the lifeblood of modern businesses, and compromising it can have severe consequences. Without effective anomaly detection mechanisms in place, companies risk losing revenue and the hard-earned trust of their customers. The impact of such losses extends beyond immediate financial damage—businesses face the potential loss of brand equity and customer loyalty, which are much harder to restore.
In fact, the loss of sensitive customer information due to inadequate anomaly detection can lead to a breach of trust that is difficult, if not impossible, to recover. Customers expect their data to be handled securely and responsibly, and failing in this aspect can lead to long-term reputational damage. Anomaly detection helps maintain the integrity of customer data and by extension, preserves the trust customers place in a business.
An anomaly detection system relies on either manual analysis or machine learning (ML). Both can be challenging, as they require strong domain knowledge and the difficult task of anticipating possible statistical anomalies before they manifest.
One of the primary challenges of implementing an effective anomaly detection strategy is the initial lack of familiarity with the technology among most organizations, which can make the scaling process complex and resource intensive.
Maintaining the integrity of the detection efforts once the system is in place is another ongoing challenge. It requires continuous monitoring and adjustment as data patterns evolve and new types of anomalies emerge. This dynamic aspect of anomaly detection demands constant vigilance and adaptability from businesses.
Machine learning has revolutionized the way anomalies are detected by enabling systems to learn from data, identify patterns, and predict future outcomes. This is particularly effective in environments where data complexity and volume exceed human analytical capabilities. ML models, once trained on a dataset, can efficiently process large volumes of information and detect deviations from established patterns without explicit programming.
Unstructured data
Structured data carries a foundation of understanding and meaning; it has already been interpreted and organized into a digestible dataset. Encoded or unstructured data, by contrast, offers little interpretation or context and can render an algorithm useless until it has been structured.
Large datasets
A dataset needs to be large enough to establish a reliable trend before genuine anomalies can be identified. Valid inferences are harder to draw from a small dataset, whereas a larger one makes it easier to tell whether a point is a true anomaly or simply part of a trend and less of an outlier than expected.
Anomaly detection techniques primarily revolve around identifying deviations from expected patterns within data distributions. These techniques can be classified into supervised and unsupervised methods, each leveraging different approaches to discern anomalies.
Supervised
Supervised techniques require a labeled dataset, where the outcomes are already known, to train the model to identify anomalies. This approach is particularly effective when there is plenty of historical data and anomalies are previously identified. Key techniques under supervised anomaly detection include:
- Density-based detection: Density-based anomaly detection works on the assumption that normal data points are located close together, while anomalies lie farther away. It is commonly built on the k-nearest neighbors (k-NN) algorithm, which is simple and nonparametric. k-NN typically classifies data based on similarity, measured with distance metrics such as Manhattan, Minkowski, Hamming, or Euclidean distance (see the sketch after this list).
- Support Vector Machine detection: SVMs usually rely on supervised learning, but variants such as the one-class SVM can also identify anomalies in unsupervised settings. A soft boundary is learned from the training set; normal data instances cluster within the boundary, and anomalies are the points that fall outside it, as sketched below.
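The snippet below is a rough sketch of both techniques on synthetic data (assuming scikit-learn and NumPy are available; the parameters are illustrative, not recommendations). It scores each point by its mean distance to its k nearest neighbors and, separately, fits a one-class SVM whose predict() method labels points outside the learned boundary as -1:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(200, 2))            # mostly "normal" points
X = np.vstack([X, [[6.0, 6.0], [-5.0, 7.0]]])  # two obvious outliers

# Density-based scoring: points whose k-nearest-neighbor distances are large
# sit in sparse regions and are treated as anomalies.
nn = NearestNeighbors(n_neighbors=5).fit(X)
distances, _ = nn.kneighbors(X)
knn_scores = distances.mean(axis=1)
knn_anomalies = np.argsort(knn_scores)[-2:]    # the two most isolated points

# One-class SVM: learn a soft boundary around the bulk of the data;
# points that fall outside that boundary are predicted as -1.
ocsvm = OneClassSVM(nu=0.02, kernel="rbf", gamma="scale").fit(X)
svm_anomalies = np.where(ocsvm.predict(X) == -1)[0]

print(knn_anomalies, svm_anomalies)
```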
Unsupervised
Unsupervised techniques do not require labeled data, making them suitable for scenarios where anomalies are not previously known or labeled. This flexibility allows for broader application but with potentially less precision than supervised methods.
- Clustering-based detection: Clustering assumes that similar data points belong to the same clusters or groups, as determined by their distance from local centroids (the average of all points in a cluster). The k-means algorithm creates “k” clusters of similar data points; anomalies are the points that fall far from every cluster centroid (see the sketch below).
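A minimal k-means sketch along these lines (again assuming scikit-learn; the cluster count and threshold are arbitrary choices for illustration) scores each point by its distance to its assigned centroid and flags points with unusually large distances:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Two well-separated groups of normal points plus one point far from both.
X = np.vstack([
    rng.normal([0, 0], 0.5, size=(100, 2)),
    rng.normal([5, 5], 0.5, size=(100, 2)),
    [[10.0, -4.0]],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
# Distance from each point to its assigned centroid serves as the anomaly score.
assigned_centroids = kmeans.cluster_centers_[kmeans.labels_]
dist_to_centroid = np.linalg.norm(X - assigned_centroids, axis=1)

threshold = dist_to_centroid.mean() + 3 * dist_to_centroid.std()
print(np.where(dist_to_centroid > threshold)[0])  # index of the far-off point
```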
Time series data is a sequence of values collected over time. Each data point has two components: the time and date at which it was collected, and the value recorded at that moment. Data is gathered continually and is used mainly to predict future events rather than to serve as a snapshot in and of itself. Time series anomaly detection can be applied to KPIs such as the following (a simple baseline sketch appears after this list):
- Active users
- Web page views
- Cost per click (CPC)
- Cost per lead (CPL)
- Bounce rate
- Churn rate
- Average order value
- Mobile app installations
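For any of these KPIs, one simple way to establish a baseline of typical behavior (a hypothetical pandas sketch; the metric, window size, and threshold are illustrative) is a rolling mean and standard deviation, flagging any value that falls outside the expected band:

```python
import pandas as pd

# Hypothetical daily page-view KPI with one sudden drop.
views = pd.Series(
    [1200, 1180, 1220, 1210, 1190, 300, 1205, 1215],
    index=pd.date_range("2024-01-01", periods=8, freq="D"),
)

# Baseline of typical behavior: rolling mean and standard deviation over the
# previous days; a point is flagged when it falls outside mean ± 3 * std.
window = 4
rolling_mean = views.rolling(window).mean().shift(1)
rolling_std = views.rolling(window).std().shift(1)

anomalies = (views - rolling_mean).abs() > 3 * rolling_std
print(views[anomalies])  # the 300-view drop on 2024-01-06
```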
Time series anomaly detection establishes a baseline for typical behavior in identified KPIs, as in the sketch above. Common applications of anomaly detection include:
- Data cleaning
- Intrusion detection
- Fraud detection
- Systems health monitoring
- Event detection in sensor networks
- Ecosystem disturbances
Anomaly detection for service performance
A reactive approach to detection can result in downtime and performance issues that create consequences before there is a solution. Detecting anomalies in performance can help companies predict when and why an issue may arise within a business service. Most industries can benefit; here are two examples:
- Telco: Telecom analytics produce enormous datasets, and advanced solutions are important to detect and prevent the latency, jitter, and poor call quality that can degrade performance.
- Adtech: Complex application performance can be difficult to monitor due to the speed at which transactions occur within an ad auction. Anomaly detection can spot issues in an application before it crashes, preventing downtime during an ad auction.
Anomaly detection for system performance
Systems need to run smoothly and with as little error as possible. Anomaly detection helps teams identify and resolve performance issues early on. While this is applicable in many different industries, two common examples are:
- Telecommunications: Telecom providers manage vast networks where even minor anomalies can lead to service disruptions, dropped calls, latency, or simply poor voice quality. Anomaly detection helps locate areas of network congestion, as well as signs of hardware failures or signal interference, so providers can maintain consistently high service quality.
- Fintech: The financial industry trades in milliseconds, and there needs to be certainty the applications overseeing trades are secure and consistent. Anomaly detection can prevent downtime or glitches by watching for anything abnormal in application performance and operations.
Anomaly detection for user experience
A user experience can be negative if a site experiences service degradation. Anomaly detection can help companies react to any lapses before they frustrate customers and lead to a loss of revenue. A few industries can benefit from anomaly detection in this manner:
- Gaming: Games are complicated, which makes manual monitoring of their permutational complexity nearly impossible. Artificial intelligence (AI) can counteract the glitches and errors that would otherwise degrade the user experience.
- Online business: Online businesses rely heavily on UX for success. The IT team needs to watch for and mitigate API errors, server downtime, and load-time glitches. Rapid root cause analysis through anomaly detection can quickly pinpoint an issue to help platforms, data centers, and operating systems receive repairs with little to no downtime.
When building an anomaly detection system, there are several crucial considerations to ensure it meets the specific needs of your business:
- Timeliness: Determine the required speed for anomaly detection. Is real-time analysis necessary, or can the system afford to identify anomalies on a delayed basis, such as daily, weekly, or monthly? The urgency of response will guide the architectural and processing choices.
- Scale: Assess the volume of data that the system needs to handle. Will it need to process hundreds, thousands, or millions of metrics? The scale of data affects the computational strategies and technologies deployed.
- Conciseness: Decide whether the system should provide a holistic view of anomalies across various metrics or if detecting anomalies within individual metrics is sufficient. This will influence the complexity and design of the detection algorithms.
Anomaly detection needs raise the question: should an organization build a solution or buy a system? There are a few important things to consider in the decision-making process:
- The size of the company
- The volume of the data that is going to be processed
- Capacity for internal development
- Any plans for expansion
- Demands of stakeholders
- Budget demands
- The size of a team that is available
- Internal data science expertise
ServiceNow provides powerful tools for businesses looking to enhance their anomaly detection capabilities. This is made possible thanks to the Now Platform®, a unified system capable of integrating predictive AIOps and automation to monitor IT infrastructure in real time. By providing full visibility into multistack environments—on-premises, cloud, and serverless—ServiceNow helps organizations identify anomalies in performance metrics and quickly address potential disruptions.
ServiceNow IT Operations Management (ITOM) takes this even further. ITOM’s Metric Intelligence application enables proactive analysis of IT infrastructure to detect performance degradation and prevent outages. Dynamic thresholds can be set automatically using machine learning to reduce manual configuration and quickly pinpoint anomalies. Metric Explorer can be applied to visualize resource performance across configuration items (CIs), including cloud-native containers, and the Time Series Database provides long-term data storage for deeper analysis and cross-team collaboration on historical trends. ServiceNow makes it all possible.
Minimize service disruptions, resolve incidents faster, and maintain consistent system performance with ITOM from ServiceNow. Request a demo today!