Identifying system issues with synthetic monitoring
Tests run by synthetic monitors enable service owners and operators to view service endpoint performance at scale.
Overview of using synthetic monitoring
As a service owner, you can use synthetic monitoring to monitor your service's endpoints, verifying that they're available and performing as expected. You can be notified when synthetic tests fail, enabling you to mitigate issues quickly. You can see trends in HTTP API success rates and response times.
As an operator, you can use synthetic monitoring as part of triaging issues. When a service is reported to have issues, you can view test results for that service's endpoints. Failing tests or slow response times indicate where to start investigating.
View aggregate information about the monitors
The synthetic monitoring landing page shows an overview of all created monitors, including inactive monitors.
From here, you can see the status for all your monitors. Selecting a card at the top filters the list of monitors. For example, you can view only the monitors that have failed or that are in an unknown state.
By default, the list of monitors is sorted by the timestamp in the Updated column. You can select a different column header to sort by that category.
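The filtering and sorting behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the product's API: the `Monitor` class, its field names, the `"passed"` status value, and the sample monitor names are all hypothetical, and the newest-first direction for the Updated sort is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Monitor:
    # Hypothetical model of one row in the monitor list.
    name: str
    status: str        # "failed" and "unknown" come from the text; "passed" is assumed
    updated: datetime  # stands in for the Updated column


def filter_by_status(monitors, statuses):
    """Keep only monitors whose status is in `statuses`, like selecting a status card."""
    wanted = set(statuses)
    return [m for m in monitors if m.status in wanted]


def sort_by_updated(monitors, newest_first=True):
    """Order monitors by their Updated timestamp; the direction is an assumption."""
    return sorted(monitors, key=lambda m: m.updated, reverse=newest_first)


# Made-up sample data for illustration only.
monitors = [
    Monitor("checkout-api", "passed", datetime(2024, 5, 1, 9, 0)),
    Monitor("login-api", "failed", datetime(2024, 5, 1, 9, 5)),
    Monitor("search-api", "unknown", datetime(2024, 5, 1, 8, 30)),
]

# Show only monitors that failed or are in an unknown state, most recently updated first.
needs_attention = sort_by_updated(filter_by_status(monitors, {"failed", "unknown"}))
for m in needs_attention:
    print(m.name, m.status, m.updated.isoformat())
```

Sorting by a different column corresponds to swapping the `key` function, for example `key=lambda m: m.name`.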
Selecting a monitor opens its details, including test results, configuration, and associated configuration items (CIs).
View a monitor and its tests
To view a monitor's details and the results of its tests, select a monitor from the synthetic monitoring landing page. The Monitor details page provides key information needed to understand how the monitor is performing.
Use the information in the header to understand the basic health of the monitor, including its status and when it last ran a test.
Use the Configuration card to understand the basic configuration parameters of the monitor. If a monitor is in an unknown state, you can use the endpoint link to verify that the correct endpoint is selected for the monitor. An unknown state often occurs due to an issue with the related Agent Client Collector (ACC) proxy.
View individual tests
The Metrics card displays two charts to help you understand the health of each test the monitor has run. The Failed tests chart displays each test with a value of 0 when it was successful and a value of 1 when it failed. Hover over a point on the chart to view further details.
The Response time chart shows the amount of time in milliseconds that it took to receive a response from the endpoint. Hover over a point on the chart to view details.
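To make the two charts concrete, the following sketch shows how a list of test results maps to the 0/1 series plotted in the Failed tests chart, and how that series and the response times can be summarized. The `CheckResult` class, its fields, and the sample values are hypothetical; the failure-rate calculation is an illustrative addition, not something the Metrics card is stated to display.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    # Hypothetical model of one synthetic test run.
    passed: bool
    response_ms: float  # response time in milliseconds


def failed_series(results):
    """Encode each test as 0 (successful) or 1 (failed), as in the Failed tests chart."""
    return [0 if r.passed else 1 for r in results]


def failure_rate(results):
    """Fraction of tests that failed; 0.0 when there are no results."""
    series = failed_series(results)
    return sum(series) / len(series) if series else 0.0


def slowest(results):
    """Return the result with the highest response time, or None if the list is empty."""
    return max(results, key=lambda r: r.response_ms, default=None)


# Made-up sample data for illustration only.
results = [
    CheckResult(True, 120.0),
    CheckResult(False, 2400.0),
    CheckResult(True, 135.0),
    CheckResult(True, 110.0),
]

print(failed_series(results))
print(failure_rate(results))
print(slowest(results))
```

Because failures are encoded as 1, summing the series counts the failed tests, and dividing by its length gives the failure rate.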
The Monitor result history table lists each test. By default, the table is sorted from newest to oldest by the timestamp. You can sort by any column to help find issues. For example, you can sort by Result to see all failures together or you can sort by Response time to view the tests with the highest latency.
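The sort orders described for the result history table can be sketched as follows. The row layout (`timestamp`, `result`, `response_ms`), the `"passed"` and `"failed"` values, and the sample rows are assumptions made for illustration, not the table's actual schema.

```python
from datetime import datetime

# Made-up history rows for illustration only.
history = [
    {"timestamp": datetime(2024, 5, 1, 9, 0), "result": "passed", "response_ms": 120},
    {"timestamp": datetime(2024, 5, 1, 9, 5), "result": "failed", "response_ms": 2400},
    {"timestamp": datetime(2024, 5, 1, 9, 10), "result": "passed", "response_ms": 310},
]


def newest_first(rows):
    """Default order: newest to oldest by timestamp."""
    return sorted(rows, key=lambda r: r["timestamp"], reverse=True)


def failures_first(rows):
    """Group failures at the top; within each group, keep newest-first order."""
    # sorted() is stable, so the newest-first order survives inside each group.
    return sorted(newest_first(rows), key=lambda r: r["result"] != "failed")


def slowest_first(rows):
    """Order by response time, highest latency first."""
    return sorted(rows, key=lambda r: r["response_ms"], reverse=True)


for row in failures_first(history):
    print(row["timestamp"].isoformat(), row["result"], row["response_ms"])
```

Sorting by result brings all failures together for review, while sorting by response time surfaces the highest-latency tests even when they passed.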