What is anomaly detection?

Anomaly detection uses a series of tools to identify and address anomalies as they appear in a set of data.

An anomaly is a change within a data pattern, an outlier, or an event that falls outside of a standard trend. It is a deviation from something expected, or something that doesn’t conform to expectations.

An anomaly, or an outlier in a pattern, can be indicative of something that falls outside of the norm or something that is possibly not right.

Point/global anomalies

A single point of data that has been identified as too far off from the rest.
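One common way to decide that a point is "too far off" is a z-score test. The sketch below is illustrative only: the function name `point_anomalies`, the threshold of 3 standard deviations, and the sample readings are assumptions, not something the article prescribes.

```python
from statistics import mean, pstdev

def point_anomalies(values, threshold=3.0):
    """Return (index, value) pairs lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Made-up readings that hover around 10, with one injected spike at index 17.
readings = [10.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(30)]
readings[17] = 25.0

print(point_anomalies(readings))
```

Because the mean and standard deviation are themselves pulled by the outlier, this test works best when anomalies are rare relative to the size of the data set.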

Contextual anomalies

An anomaly that is abnormal in the context of one data set, but normal in the context of another data set. This is the most common type of anomaly in time series data.
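A minimal sketch of the idea, assuming each value carries a context label (here a hypothetical "summer" or "winter" tag) and that a value is judged only against other values with the same label. The temperatures and the threshold of 2 are invented for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev

def contextual_anomalies(records, threshold=2.0):
    """records: list of (context, value). Flag values that are unusual for their own context."""
    by_context = defaultdict(list)
    for context, value in records:
        by_context[context].append(value)
    # Per-context mean and standard deviation.
    stats = {c: (mean(vs), pstdev(vs)) for c, vs in by_context.items()}
    flagged = []
    for context, value in records:
        mu, sigma = stats[context]
        if sigma > 0 and abs(value - mu) / sigma > threshold:
            flagged.append((context, value))
    return flagged

# Illustrative temperatures: 25 degrees is ordinary in summer but not in winter.
records = (
    [("summer", t) for t in [24, 26, 25, 26, 25, 26, 24, 25]]
    + [("winter", t) for t in [2, 4, 3, 1, 3, 2, 4, 25]]
)

print(contextual_anomalies(records))
```

The same reading of 25 appears in both groups, but only the winter occurrence is flagged, because the comparison is made within each context rather than across the whole data set.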

Collective anomalies

A collective anomaly occurs when an entire subset of data is anomalous compared with a wider set of data. Individual data points aren’t a consideration when identifying collective anomalies; the subset is judged as a whole.
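The following sketch shows one simple way to look for a collective anomaly: compare the mean of each sliding window with the mean of the whole series. The window size, the tolerance, and the signal are assumptions chosen for illustration. Every individual value in the signal is either 8 or 12, so no single point looks unusual on its own.

```python
from statistics import mean

def collective_anomalies(values, window=6, tolerance=1.5):
    """Return start indices of windows whose mean strays from the overall mean by more than `tolerance`."""
    overall = mean(values)
    starts = []
    for start in range(len(values) - window + 1):
        window_mean = mean(values[start:start + window])
        if abs(window_mean - overall) > tolerance:
            starts.append(start)
    return starts

# A signal that normally alternates 8, 12, 8, 12, ... but gets stuck at 12 for a while.
signal = [8, 12] * 10 + [12] * 6 + [8, 12] * 10

print(collective_anomalies(signal))
```

The flagged windows cover the stretch where the signal stops alternating, even though each reading of 12 is a value seen throughout the normal data.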

Anomaly detection is the identification of a rare outlier or a data point that falls outside of the trends of a set of data. Anomalies can be indicative of suspicious events, malfunctions, defects, or fraud.

The challenge of anomaly detection

An anomaly detection system either relies on manual analysis or requires the use of machine learning (ML). Both can be challenging, as they demand strong domain knowledge and the difficult task of predicting possible statistical anomalies before they manifest.

Anomaly detection with machine learning

Benefits of anomaly detection and machine learning

Machine learning (ML) works better than manual detection for anomaly detection: it is more timely, highly adaptive to changes, and able to handle large data sets easily.

Unstructured data

Data that comes structured has a foundation of understanding and meaning behind the data—it has been interpreted and organised into a digestible data set. Encoded or unstructured data can render an algorithm useless until it is structured, as there is little interpretation and understanding of the context of the data.

Large datasets needed

A data set needs to be large enough to establish a reliable trend and identify genuine anomalies. Valid inferences cannot be made from a small set of data, whereas a larger set can show whether a point is truly an anomaly, part of a trend, or less of an outlier than it first appeared.

Talent required

Knowledgeable engineers or data scientists are needed to train a machine learning algorithm. Depending on the capabilities of the solution, training can take a couple of weeks or months, and different solutions require different levels of machine learning skill.

Anomaly detection in three settings

Supervised

Data that is supervised comes pre-prepared with each of the data points labelled as “nominal” or “anomaly”. All anomalies are identified ahead of time for the model to train on.

Clean

All data points are labelled “nominal”, and “anomaly” points have not been labelled. Clean data leaves the role of detecting anomalies to the data modeller, as all data points within the clean set are presumed to be “nominal”.

Unsupervised

Unsupervised data arrives with no “nominal” or “anomaly” points labelled. It is up to the data modeller to determine the points that are “nominal” and “anomaly”—there is no foundation or understanding of what the accurate outcome may be.

Novelty detection is the process of identifying a previously unobserved pattern in a new observation that is not contained in the training data.
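The idea of learning from known-normal ("clean") training data and then judging new observations against it, often called novelty detection, can be sketched as follows. The helper names `fit_baseline` and `is_novel`, the 3-sigma rule, and the training values are hypothetical and chosen only to keep the example small.

```python
from statistics import mean, stdev

def fit_baseline(training_values):
    """Summarise clean training data as (mean, sample standard deviation)."""
    return mean(training_values), stdev(training_values)

def is_novel(value, baseline, threshold=3.0):
    """True if a new observation lies outside `threshold` standard deviations of the baseline."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma

# Training data is assumed to contain only nominal points.
training = [50.2, 49.8, 50.1, 50.0, 49.9, 50.3, 49.7, 50.0]
baseline = fit_baseline(training)

for observation in (50.4, 53.0):
    print(observation, is_novel(observation, baseline))
```

Unlike the unsupervised setting, the model here never sees an anomaly during training; anything sufficiently unlike the training data is treated as new.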

The easiest approach to detecting an anomaly is to identify something irregular within a data spread that seems to deviate from a trend or from common summary statistics of the distribution, such as the mean, median, and mode.
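As a rough illustration of this statistical approach, the sketch below measures how far each value sits from the median, scaled by the median absolute deviation (MAD). The constant 0.6745 and the cut-off of 3.5 are a commonly used convention for this "modified z-score"; they are an assumption here, not something taken from the article, and the sample values are invented.

```python
from statistics import median

def mad_outliers(values, threshold=3.5):
    """Return values whose modified z-score (based on median and MAD) exceeds `threshold`."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # more than half the values are identical; the score is undefined
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]

response_times_ms = [12, 13, 12, 14, 13, 15, 14, 13, 12, 40]

print(mad_outliers(response_times_ms))
```

The median is used instead of the mean because a single extreme value shifts the mean noticeably but barely moves the median.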

How to use machine learning for anomaly detection and condition monitoring

Digital transformation

Also known as digitalisation and Industry 4.0, a digital transformation uses technology and data to streamline productivity and increase efficiency. Data is increasingly abundant as machines and devices are more and more connected and capable of transferring a wealth of data to countless places. The goal is to then extract and analyse information from this data to reduce costs and downtime. Machine learning and data analytics play a large role in this.

Condition monitoring

Every machine, regardless of its respective complexities, will reach a state of bad health. This doesn’t indicate that a machine has reached the end of its life or that it must shut down, but rather that it could be in need of maintenance to restore the machine to its full and optimal performance. Having a large data set to analyse can reveal anomalies that may predict or signal when a machine is in need of maintenance or replacement.

Density-based approaches

Density-based anomaly detection

Density-based anomaly detection functions under the assumption that all nominal data points are located close together, and anomalies are located further away. It is based on the k-nearest neighbours (k-NN) algorithm, which is simple and non-parametric. k-NN is typically used to classify data by similarity, measured with a distance metric such as Manhattan, Minkowski, Hamming, or Euclidean distance.
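A minimal sketch of distance-based scoring in the spirit of k-NN, assuming Euclidean distance and k = 3: each point is scored by its average distance to its k nearest neighbours, so points in sparse regions receive high scores. The function name `knn_scores` and the sample points are illustrative.

```python
from math import dist

def knn_scores(points, k=3):
    """Score each point by its mean Euclidean distance to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        distances = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(distances[:k]) / k)
    return scores

# Five points close together and one far away.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (1.0, 0.8), (6.0, 6.0)]
scores = knn_scores(points)

# The index with the highest score is the most isolated point.
most_anomalous = max(range(len(points)), key=scores.__getitem__)
print(points[most_anomalous], round(scores[most_anomalous], 2))
```

This brute-force version compares every pair of points, which is fine for a small example but would normally be replaced by an indexed neighbour search on larger data sets.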

Clustering-based anomaly detection

Clustering is based on the assumption that similar data points tend to belong to similar clusters or groups, as determined by their distance from local centroids (the average of all points in a cluster). The k-means clustering algorithm creates “k” clusters of similar data points. Anomalies are points that lie far from every centroid and so fall outside of the “k” clusters.
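The sketch below implements a plain version of k-means (Lloyd's algorithm) with k = 2 and then flags points whose distance to the nearest centroid exceeds a cut-off. The starting centroids, the `max_distance` value of 2.0, and the sample points are assumptions made for illustration; a real system would need a principled way to choose them.

```python
from math import dist

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm: assign points to the nearest centroid, then recompute centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        centroids = [
            # New centroid is the coordinate-wise average; keep the old one if a cluster is empty.
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids

def far_from_centroids(points, centroids, max_distance):
    """Return points whose nearest centroid is more than `max_distance` away."""
    return [p for p in points if min(dist(p, c) for c in centroids) > max_distance]

points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1), (3.0, 9.0)]
centroids = kmeans(points, centroids=[points[0], points[3]])
outliers = far_from_centroids(points, centroids, max_distance=2.0)

print(outliers)
```

Note that k-means still assigns the outlying point to a cluster and lets it pull that cluster's centroid towards itself, which is why the check is based on distance to the centroid rather than on cluster membership.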

Support vector machine (SVM)-based anomaly detection

SVM usually uses supervised learning, but there are variants that can also identify anomalies in unsupervised learning environments. A soft boundary is learnt from the training set so that normal data instances are clustered within the boundary, and anomalies are identified as instances that fall outside of the learnt boundary.

Time series data is a sequence of values collected over time. Each data point typically has two components: the date and time at which it was collected, and the value itself. Data is gathered continually and is mainly used to predict events in the future rather than serve as a projection in and of itself. Time series anomaly detection can be applied to metrics such as:

  1. Active users
  2. Web page views
  3. Cost per click (CPC)
  4. Cost per lead (CPL)
  5. Bounce rate
  6. Churn rate
  7. Average order value
  8. Mobile app installations

Time series anomaly detection establishes a baseline for typical behaviour in identified KPIs.
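One simple way to express a baseline is a trailing window: each new value is compared with the mean and standard deviation of the values that came just before it. The sketch below assumes a 7-day window and a 3-sigma threshold, and the daily active user figures are made up for illustration.

```python
from statistics import mean, stdev

def baseline_anomalies(series, window=7, threshold=3.0):
    """series: list of (timestamp, value). Flag values far from the trailing-window baseline."""
    flagged = []
    for i in range(window, len(series)):
        history = [value for _, value in series[i - window:i]]
        mu, sigma = mean(history), stdev(history)
        timestamp, value = series[i]
        if sigma > 0 and abs(value - mu) > threshold * sigma:
            flagged.append((timestamp, value))
    return flagged

# Hypothetical KPI: daily active users, with a sudden drop on one day.
daily_active_users = [
    ("2024-03-01", 1020), ("2024-03-02", 1005), ("2024-03-03", 998),
    ("2024-03-04", 1012), ("2024-03-05", 1007), ("2024-03-06", 995),
    ("2024-03-07", 1010), ("2024-03-08", 1003), ("2024-03-09", 640),
    ("2024-03-10", 1008),
]

print(baseline_anomalies(daily_active_users))
```

A trailing window adapts as the KPI drifts, but a large anomaly inflates the baseline for the days that follow it, so production systems often exclude flagged points from the history or use seasonal baselines instead.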

  • Data cleaning
  • Intrusion detection
  • Fraud detection
  • Systems health monitoring
  • Event detection in sensor networks
  • Ecosystem disturbances

Anomaly detection for service performance

A reactive approach to detection can result in downtime and performance issues that cause damage before there is a solution. Detecting anomalies in performance can help companies predict when and why there could be an issue within a business service. Most industries can benefit; here are two examples:

  • Telco: Telecom analytics produce enormous sets of data, and advanced solutions are important to detect and prevent latency, jitter, and bad call quality that can lower performance.
  • Adtech: Complex application performance can be difficult to monitor due to the speed at which transactions occur within an ad action. Anomaly detection can look for issues in an application before the application crashes, thus preventing downtime during an ad action.

Anomaly detection for product quality

Products need to run smoothly and with as little error as possible. The natural evolution of products can result in behavioural anomalies in everything from a new feature to an A/B test, and ongoing monitoring of any behavioural anomalies can prevent downtime or ongoing issues. While most industries can benefit, here are just two examples:

  • eCommerce: Anomaly detection can look for any strange behaviour or product quality issues like price glitches or abnormal changes in seasonality.
  • Fintech: The financial industry trades in milliseconds, and there needs to be certainty that the applications overseeing trades are secure and consistent. Anomaly detection can prevent downtime or glitches by watching for anything abnormal in application performance and operations.

Anomaly detection for user experience

A user experience can be negative if a site experiences service degradation. Anomaly detection can help companies react to any lapses before they frustrate customers and lead to a loss of revenue. There are a few industries that can benefit from anomaly detection in this manner:

  • Gaming: Games are complicated, which makes manual monitoring of the permutational complexities near impossible. Artificial intelligence (AI) can counteract glitches and errors in a user experience.
  • Online business: Online businesses rely heavily on UX for success. The IT team needs to watch for and mitigate API errors, server downtime, and load-time glitches. Rapid root cause analysis through anomaly detection can quickly pinpoint an issue to help platforms, data centres, and operating systems receive repairs with little to no downtime.

  • Automated anomaly detection provides accurate insights in real time while providing ranking, detection, and grouping of data. This eliminates the need for a larger team of data analysts.
  • Supervised and unsupervised machine learning: Machine learning ideally occurs without supervision or human interaction, but there should still be a few analysts to feed baseline data and occasionally monitor the machine learning program.
  • Hybrid: Scaled anomaly detection that provides the flexibility of manual rule making for specific anomalies.

With anomaly detection, you have to ask the question: do you build a solution, or do you buy a system? There are a few important things to consider in the decision-making process:

  • The size of the company
  • The volume of the data that is going to be processed
  • Capacity for internal development
  • Any plans for expansion
  • Demands of stakeholders
  • Budget demands
  • The size of a team that is available
  • Internal data science expertise
