Linux metrics
The following table lists the metrics that are gathered as output from Linux checks. Entries indicated as Featured metrics are high-visibility metrics that are displayed in the Operator Workspace Metric tab after an alert is generated. These metrics provide the operator with additional information to help them further explore the specified issue.

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| proc.acc.running | | | Number of processes running with this name (acc). |
| proc.acc.cpuPercent | | | Percentage of CPU taken by the process. |
| proc.acc.memPercent | | | Percentage of memory taken by the process. |

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| reboot.count.today | | | Number of reboots today. |

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| cpu.total.user | | | Normal processes executing in user mode; cpu.total.user is the total of the cpuN.user metrics. |
| cpu.total.nice | | | Niced processes executing in user mode; cpu.total.nice is the total of the cpuN.nice metrics. |
| cpu.total.system | | | Time the CPU spent running the kernel; cpu.total.system is the total of the cpuN.system metrics. |
| cpu.total.idle | | | Total time the CPU spent in an idle state; cpu.total.idle is the total of the cpuN.idle metrics. |
| cpu.total.iowait | | | Total time the CPU spent waiting for I/O operations to complete; cpu.total.iowait is the total of the cpuN.iowait metrics. |
| cpu.total.irq | | | Total time the processor spent servicing interrupts; cpu.total.irq is the total of the cpuN.irq metrics. |
| cpu.total.softirq | | | Time spent servicing soft interrupt requests; cpu.total.softirq is the total of the cpuN.softirq metrics. |
| cpu.total.steal | | | Total time the virtual CPU spent waiting for the hypervisor to service another virtual CPU. Applies only to virtual machines. |
| cpu.total.guest | | | Total time the CPU spent running the virtual processor. Applies only to hypervisors. |
| cpu.total.guest_nice | | | Total time the CPU spent running a niced guest OS; cpu.total.guest_nice is the total of the cpuN.guest_nice metrics. |
| cpu.<cpu-core>.user | | | Time spent with normal processing in user mode. |
| cpu.<cpu-core>.nice | | | Time spent with niced processing in user mode. |
| cpu.<cpu-core>.system | | | Time spent running in kernel mode. |
| cpu.<cpu-core>.idle | | | Time spent idle. |
| cpu.<cpu-core>.iowait | | | Time spent waiting for I/O to complete. Also considered idle time. |
| cpu.<cpu-core>.irq | | | Time spent serving hardware interrupts. |
| cpu.<cpu-core>.softirq | | | Time spent serving software interrupts. |
| cpu.<cpu-core>.steal | | | Time stolen by other operating systems running in a virtual environment. |
| cpu.<cpu-core>.guest | | | Time spent running a virtual CPU or guest OS under the control of the kernel. |
| cpu.<cpu-core>.guest_nice | | | Total time the CPU spent running a niced guest OS. |
| cpu.intr | | | Interrupts serviced since boot time. |
| cpu.ctxt | | | Total number of context switches across all CPUs. |
| cpu.btime | | | Boot time. |
| cpu.processes | | | The number of processes and threads created, which includes (but is not limited to) those created by calls to the fork() and clone() system calls. |
| cpu.procs_running | | | The total number of processes running on all CPUs. |
| cpu.procs_blocked | | | The number of processes currently blocked and waiting for I/O to complete. |
| cpu.cpu_count | | | The number of CPUs on the system. |
| cpu.<cpu-core>.cores | | | The number of CPU cores. |
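
The relationship between the cpu.total.* metrics and the per-core metrics can be illustrated with a short sketch. This page does not say where the check reads its counters; the field names match the layout of `/proc/stat`, so the example assumes that format. The function name, the sample text, and the use of `cpu0`/`cpu1` as the `<cpu-core>` part of the metric name are illustrative assumptions.

```python
# Field order of a "cpu" line in /proc/stat (assumed source of these metrics).
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_lines(stat_text):
    """Map 'cpu.<name>.<field>' to its counter for every cpu line in the text."""
    metrics = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue
        # The aggregate line is labelled "cpu"; per-core lines are "cpu0", "cpu1", ...
        name = "total" if parts[0] == "cpu" else parts[0]
        for field, value in zip(CPU_FIELDS, parts[1:]):
            metrics[f"cpu.{name}.{field}"] = int(value)
    return metrics


# Hypothetical two-core sample, not taken from a real host.
SAMPLE = """\
cpu  300 10 120 5000 40 0 6 0 0 0
cpu0 100 4 50 2500 30 0 2 0 0 0
cpu1 200 6 70 2500 10 0 4 0 0 0
ctxt 123456
"""

metrics = parse_cpu_lines(SAMPLE)
per_core_user = sum(
    value for key, value in metrics.items()
    if key.endswith(".user") and not key.startswith("cpu.total.")
)
print(metrics["cpu.total.user"], per_core_user)
```

In this sample the aggregate user counter equals the sum of the two per-core user counters, which is the relationship the table describes.
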

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| load_avg.one (featured metric) | | | The average system load over one minute. |
| load_avg.five (featured metric) | | | The average system load over five minutes. |
| load_avg.fifteen (featured metric) | | | The average system load over fifteen minutes. |
| load_avg.norm.one | | | The average system load over one minute normalized by the number of CPUs. |
| load_avg.norm.five | | | The average system load over five minutes normalized by the number of CPUs. |
| load_avg.norm.fifteen | | | The average system load over fifteen minutes normalized by the number of CPUs. |
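
As a minimal sketch of what "normalized by the number of CPUs" could mean, the example below divides each load average by the CPU count. The function name is hypothetical, and plain division is an assumption about how the check derives the load_avg.norm.* values.

```python
def normalized_load(load_avgs, cpu_count):
    """Divide the 1, 5 and 15 minute load averages by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    one, five, fifteen = load_avgs
    return {
        "load_avg.norm.one": one / cpu_count,
        "load_avg.norm.five": five / cpu_count,
        "load_avg.norm.fifteen": fifteen / cpu_count,
    }


# Hypothetical sample values; on a live Linux host the inputs could come
# from os.getloadavg() and os.cpu_count().
norm = normalized_load((2.0, 1.0, 0.5), cpu_count=4)
print(norm)
```

With this reading, a normalized value near 1.0 means there is roughly one runnable task per CPU, which makes hosts with different core counts easier to compare.
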

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| cpu.avgutilization_percentage | | | The average percentage of CPU used. |
| cpu.user_percentage (featured metric) | | | Percentage of time all CPUs were used by normal processes in user mode. |
| cpu.nice_percentage (featured metric) | | | Percentage of time all CPUs were used by niced processes in user mode. |
| cpu.system_percentage (featured metric) | | | Percentage of time all CPUs spent running the kernel. |
| cpu.idle_percentage (featured metric) | | | Percentage of time all CPUs were idle. |
| cpu.iowait_percentage (featured metric) | | | Percentage of time all CPUs waited for I/O to complete. |
| cpu.irq_percentage (featured metric) | | | Percentage of time all CPUs serviced interrupts. |
| cpu.softirq_percentage (featured metric) | | | Percentage of time all CPUs serviced software interrupts. |
| cpu.steal_percentage (featured metric) | | | Percentage of time all CPUs serviced virtual-host operating systems. |
| cpu.guest_percentage (featured metric) | | | Percentage of time all CPUs serviced guest operating systems. |
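
This page does not describe how the percentage metrics are calculated. One common approach, shown here purely as an assumption, is to take two samples of the cumulative CPU counters and express each counter's increase as a share of the total increase. The function name and the sample values are hypothetical.

```python
def cpu_percentages(prev, curr):
    """Express each counter's increase between two samples as a percentage.

    Guest time is left out of the inputs on purpose: in /proc/stat it is
    already included in user time, so adding it would double count.
    """
    deltas = {field: curr[field] - prev[field] for field in curr}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed between samples; report zero for every field.
        return {f"cpu.{field}_percentage": 0.0 for field in deltas}
    return {
        f"cpu.{field}_percentage": 100.0 * delta / total
        for field, delta in deltas.items()
    }


# Two hypothetical samples of the aggregate counters.
prev = {"user": 1000, "nice": 0, "system": 500, "idle": 8000,
        "iowait": 100, "irq": 0, "softirq": 10, "steal": 0}
curr = {"user": 1050, "nice": 0, "system": 525, "idle": 8100,
        "iowait": 125, "irq": 0, "softirq": 10, "steal": 0}

pct = cpu_percentages(prev, curr)
print(pct["cpu.user_percentage"], pct["cpu.idle_percentage"])
```
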

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| disk.<disk-name>.reads (featured metric) | | | Total number of reads completed successfully. |
| disk.<disk-name>.readsMerged | | | Total number of reads merged. |
| disk.<disk-name>.sectorsRead | | | Total number of sectors read successfully. |
| disk.<disk-name>.readTime | | milliseconds | Total number of milliseconds spent by all reads. |
| disk.<disk-name>.writes (featured metric) | | | Total number of writes completed successfully. |
| disk.<disk-name>.writesMerged | | | Total number of writes merged. |
| disk.<disk-name>.sectorsWritten | | | Total number of sectors written successfully. |
| disk.<disk-name>.writeTime | | milliseconds | Total number of milliseconds spent by all writes. |
| disk.<disk-name>.ioInProgress | | | Total number of I/Os currently in progress. |
| disk.<disk-name>.ioTime (featured metric) | | | Total time spent on I/Os. |
| disk.<disk-name>.ioTimeWeighted | | | Weighted total time spent on I/Os. This can provide a measure of both I/O completion time and the backlog that might be accumulating. |
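
The disk I/O metric names follow the field order of a `/proc/diskstats` line, so the sketch below assumes that file is the source; this page does not confirm it. The function name and the sample line are hypothetical.

```python
# Counter order after the major, minor and device-name columns of a
# /proc/diskstats line (assumed source of these metrics).
DISK_FIELDS = ("reads", "readsMerged", "sectorsRead", "readTime",
               "writes", "writesMerged", "sectorsWritten", "writeTime",
               "ioInProgress", "ioTime", "ioTimeWeighted")


def parse_diskstats_line(line):
    """Turn one diskstats line into 'disk.<disk-name>.<field>' metrics."""
    parts = line.split()
    disk_name = parts[2]
    # Newer kernels append extra columns (for example discard counters);
    # zip() stops after the fields listed above, so extras are ignored.
    return {
        f"disk.{disk_name}.{field}": int(value)
        for field, value in zip(DISK_FIELDS, parts[3:])
    }


# Hypothetical sample line, not taken from a real host.
SAMPLE_LINE = "   8       0 sda 1200 30 45000 800 900 20 36000 1500 0 1100 2300"

disk_metrics = parse_diskstats_line(SAMPLE_LINE)
print(disk_metrics["disk.sda.reads"], disk_metrics["disk.sda.ioTimeWeighted"])
```
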

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| disk.<file-system-name>.total | | | The total size of the file system. |
| disk.<file-system-name>.used | | | The total amount of space allocated to existing files in the file system. |
| disk.<file-system-name>.avail | | | The total amount of space available within the file system. |
| disk.<file-system-name>.used_percentage | | | The percentage of the available space that is currently allocated to all files on the file system. |
| disk.<file-system-name>.itotal | | | The total number of inodes on the file system. |
| disk.<file-system-name>.iused | | | The number of used inodes. |
| disk.<file-system-name>.iavail | | | The number of free (unused) inodes. |
| disk.<file-system-name>.iused_percentage | | | The percentage of used inodes. |

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| disk_usage.<disk>.total | | | Total amount of space on the disk. |
| disk_usage.<disk>.used | | | Total amount of space used on the disk. |
| disk_usage.<disk>.avail | | | Total amount of space available on the disk. |
| disk_usage.<disk>.used_percentage (featured metric) | | | The percentage of space used on the disk. |

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| memory.total | | | Total usable RAM. |
| memory.free | | | Total free RAM. |
| memory.available | | | An estimate of how much memory is available for starting new applications without swapping. |
| memory.buffers | | | Temporary storage used for raw disk blocks. |
| memory.cached | | | In-memory cache for files read from disk (the page cache). Does not include mem_swapcached. |
| memory.swapTotal (featured metric) | | | Total amount of swap space available. |
| memory.swapFree (featured metric) | | | Amount of swap space that is currently unused. |
| memory.dirty | | | Memory that is waiting to be written back to the disk. |
| memory.swapUsed (featured metric) | | | The amount of swap space in use. |
| memory.used | | | The amount of RAM in use. |
| memory.usedWOBuffersCaches | | | The amount of memory in use, excluding buffers and caches. |
| memory.freeWOBuffersCaches | | | Value of MemAvailable from /proc/meminfo if present; otherwise, the sum of free, buffered, and cached memory. |
| memory.swapUsedPercentage | | | Percentage of swap space used. |
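
The fallback described for memory.freeWOBuffersCaches can be sketched as follows. The function names and sample texts are hypothetical. The key names `MemFree`, `Buffers`, and `Cached` are the standard `/proc/meminfo` entries, and mapping them to "free + buffered + cached" is an assumption.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def free_wo_buffers_caches(meminfo):
    """Use MemAvailable when present, else free + buffers + cached."""
    if "MemAvailable" in meminfo:
        return meminfo["MemAvailable"]
    return meminfo["MemFree"] + meminfo["Buffers"] + meminfo["Cached"]


# Hypothetical samples: a kernel that reports MemAvailable and one that does not.
WITH_AVAILABLE = """\
MemTotal:       16000 kB
MemFree:         4000 kB
MemAvailable:    6000 kB
Buffers:          500 kB
Cached:          1000 kB
"""
WITHOUT_AVAILABLE = """\
MemTotal:       16000 kB
MemFree:         4000 kB
Buffers:          500 kB
Cached:          1000 kB
"""

newer = free_wo_buffers_caches(parse_meminfo(WITH_AVAILABLE))
older = free_wo_buffers_caches(parse_meminfo(WITHOUT_AVAILABLE))
print(newer, older)
```
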

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| memory_percent.free (featured metric) | | | Percentage of free RAM. |
| memory_percent.available (featured metric) | | | Percentage of memory available. |
| memory_percent.buffers (featured metric) | | | Percentage of memory used for raw disk blocks. |
| memory_percent.cached (featured metric) | | | Percentage of memory used by the in-memory cache for files read from disk. |
| memory_percent.dirty (featured metric) | | | Percentage of memory waiting to be written back to the disk. |
| memory_percent.swapUsed (featured metric) | | | Percentage of swap space used. |
| memory_percent.usedWOBuffersCaches (featured metric) | | | Percentage of memory used, excluding buffers and caches. |
| memory_percent.freeWOBuffersCaches (featured metric) | | | Percentage of memory available. |

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| system.uptime(sec) | | seconds | The amount of time the system has been working and available. |

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| vmstat.nr_free_pages | | | Pages that are currently unused by the system. |
| vmstat.nr_alloc_batch | | | Pages allocated to other domains due to insufficient memory in each domain of each non-uniform memory access (NUMA) node. |
| vmstat.nr_inactive_anon | | | Memory pages in each domain of each NUMA node that have not been accessed. |
| vmstat.nr_active_anon | | | Anonymous virtual memory pages that have been recently used. |
| vmstat.nr_inactive_file | | | File-backed memory pages in each domain of each NUMA node that have not been accessed. |
| vmstat.nr_active_file | | | File-backed memory pages that have been recently accessed. |
| vmstat.nr_unevictable | | | The number of pages in the unevictable (non-)LRU list. |
| vmstat.nr_mlock | | | Pages mapped into a VM_LOCKED VMA that are a class of unevictable pages. |
| vmstat.nr_anon_pages | | | Memory-mapped pages that are not part of a file. |
| vmstat.nr_mapped | | | The number of memory-mapped pages. |
| vmstat.nr_file_pages | | | |
| vmstat.nr_dirty | | | Pages waiting to be written to disk. |
| vmstat.nr_writeback | | | Pages currently being written to disk. |
| vmstat.nr_slab_reclaimable | | | Pages from the kernel slab memory usage that can be reclaimed. |
| vmstat.nr_slab_unreclaimable | | | Pages from the kernel slab memory usage that cannot be reclaimed. |
| vmstat.nr_page_table_pages | | | Pages allocated to page tables. |
| vmstat.nr_kernel_stack | | | Amount of memory allocated to kernel stacks. |
| vmstat.nr_unstable | | | The number of unstable pages in each domain of each NUMA node. |
| vmstat.nr_bounce | | | |
| vmstat.nr_vmscan_write | | | The number of dirty pages written back during a scan of LRUs. |
| vmstat.nr_vmscan_immediate_reclaim | | | |
| vmstat.nr_writeback_temp | | | |
| vmstat.nr_isolated_anon | | | The number of anonymous memory pages isolated in each domain of each NUMA node. |
| vmstat.nr_isolated_file | | | The number of file storage pages isolated in each domain of each NUMA node. |
| vmstat.nr_shmem | | | The number of shared memory pages. |
| vmstat.nr_dirtied | | | The number of dirty pages in each domain of each NUMA node. |
| vmstat.nr_written | | | |
| vmstat.numa_hit | | | The number of pages that were successfully allocated to this node. |
| vmstat.numa_miss | | | The number of pages that were allocated to this node because of low memory on the intended node. |
| vmstat.numa_foreign | | | The number of pages initially intended for this node that were allocated to another node. |
| vmstat.numa_interleave | | | The number of interleave policy pages successfully allocated to this node. |
| vmstat.numa_local | | | The number of pages successfully allocated on this node by a process on this node. |
| vmstat.numa_other | | | The number of pages allocated on this node by a process on another node. |
| vmstat.workingset_refault | | | |
| vmstat.workingset_activate | | | |
| vmstat.workingset_nodereclaim | | | |
| vmstat.nr_anon_transparent_hugepages | | | |
| vmstat.nr_free_cma | | | Free contiguous memory allocator (CMA) pages in each domain of each NUMA node. |
| vmstat.nr_dirty_threshold | | | |
| vmstat.nr_dirty_background_threshold | | | |
| vmstat.pgpgin | | | The number of pages brought in from disk. |
| vmstat.pgpgout | | | The number of pages written out to disk. |
| vmstat.pswpin | | | The number of pages brought in from swap space. |
| vmstat.pswpout | | | The number of pages swapped out into swap space. |
| vmstat.pgalloc_dma | |||
| vmstat.pgalloc_dma32 | |||
| vmstat.pgalloc_normal | |||
| vmstat.pgalloc_movable | |||
| vmstat.pgfree | | | The number of pages freed since the last boot. |
| vmstat.pgactivate | | | Number of page activations since the last boot. |
| vmstat.pgdeactivate | | | Number of page deactivations since the last boot. |
| vmstat.pgfault | | | Minor faults since the last boot. |
| vmstat.pgmajfault | | | Major faults since the last boot. |
| vmstat.pglazyfreed | | | |
| vmstat.pgrefill_dma | | | |
| vmstat.pgrefill_dma32 | | | |
| vmstat.pgrefill_normal | | | Number of page refills since the last boot. |
| vmstat.pgrefill_movable | |||
| vmstat.pgsteal_kswapd_dma | |||
| vmstat.pgsteal_kswapd_dma32 | |||
| vmstat.pgsteal_kswapd_normal | |||
| vmstat.pgsteal_kswapd_movable | |||
| vmstat.pgsteal_direct_dma | |||
| vmstat.pgsteal_direct_dma32 | |||
| vmstat.pgsteal_direct_normal | |||
| vmstat.pgsteal_direct_movable | |||
| vmstat.pgscan_kswapd_dma | |||
| vmstat.pgscan_kswapd_dma32 | |||
| vmstat.pgscan_kswapd_normal | | | Number of pages scanned by kswapd since boot. |
| vmstat.pgscan_kswapd_movable | | | |
| vmstat.pgscan_direct_dma | | | |
| vmstat.pgscan_direct_dma32 | | | |
| vmstat.pgscan_direct_normal | | | Number of pages scanned by direct reclaim since boot. |
| vmstat.pgscan_direct_movable | |||
| vmstat.pgscan_direct_throttle | |||
| vmstat.zone_reclaim_failed | |||
| vmstat.pginodesteal | |||
| vmstat.slabs_scanned | |||
| vmstat.kswapd_inodesteal | |||
| vmstat.kswapd_low_wmark_hit_quickly | |||
| vmstat.kswapd_high_wmark_hit_quickly | |||
| vmstat.pageoutrun | | | Number of times kswapd called page reclaim. |
| vmstat.allocstall | | | Number of times page reclaim was called directly (low memory). |
| vmstat.pgrotated | |||
| vmstat.drop_pagecache | |||
| vmstat.drop_slab | |||
| vmstat.numa_pte_updates | |||
| vmstat.numa_huge_pte_updates | |||
| vmstat.numa_hint_faults | |||
| vmstat.numa_hint_faults_local | |||
| vmstat.numa_pages_migrated | |||
| vmstat.pgmigrate_success | |||
| vmstat.pgmigrate_fail | |||
| vmstat.compact_migrate_scanned | |||
| vmstat.compact_free_scanned | |||
| vmstat.compact_isolated | |||
| vmstat.compact_stall | | | The number of times a process stalled while running memory compaction to free a huge page for use. |
| vmstat.compact_fail | | | The number of times the system attempted to compact memory but failed. |
| vmstat.compact_success | | | The number of times the system compacted memory and freed a huge page for use. |
| vmstat.htlb_buddy_alloc_success | |||
| vmstat.htlb_buddy_alloc_fail | |||
| vmstat.unevictable_pgs_culled | |||
| vmstat.unevictable_pgs_scanned | |||
| vmstat.unevictable_pgs_rescued | |||
| vmstat.unevictable_pgs_mlocked | |||
| vmstat.unevictable_pgs_munlocked | |||
| vmstat.unevictable_pgs_cleared | |||
| vmstat.unevictable_pgs_stranded | |||
| vmstat.thp_fault_alloc | | | The number of huge pages successfully allocated to handle a page fault. |
| vmstat.thp_fault_fallback | | | The number of times a page fault failed to allocate a huge page and fell back to using small pages. |
| vmstat.thp_collapse_alloc | | | The number of times a range of pages was collapsed into one huge page with the successful allocation of a new huge page to store the data. |
| vmstat.thp_collapse_alloc_failed | | | The number of times a range of pages was to be collapsed into one huge page but the allocation failed. |
| vmstat.thp_split | | | The number of base pages to split from a huge page. |
| vmstat.thp_zero_page_alloc | | | The number of successful allocations of huge zero pages. |
| vmstat.thp_zero_page_alloc_failed | | | The number of times the kernel failed to allocate a huge zero page and fell back to using small pages. |
| vmstat.balloon_inflate | |||
| vmstat.balloon_deflate | |||
| vmstat.balloon_migrate | | | |

| Metric type | Resource (name of specific database, where relevant) | Units | Metric type description |
|---|---|---|---|
| proc.<process>.VmSize | | | The total amount of virtual memory used by the process. |
| proc.<process>.VmRSS | | | The non-swapped physical memory used by a process. |
| proc.<process>.VmSwap | | | The total amount of swap space used. |
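
The per-process memory metric names match fields in `/proc/<pid>/status`, so the following sketch assumes that file as the source; this page does not confirm it. The function name, the process name `sshd`, and the sample text are hypothetical. Values are kept in the units reported by the file (kB), because no units are listed in the table.

```python
WANTED_FIELDS = ("VmSize", "VmRSS", "VmSwap")


def parse_vm_fields(status_text, process):
    """Extract the VmSize, VmRSS and VmSwap values for one process."""
    metrics = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in WANTED_FIELDS:
            # The value is followed by its unit, for example "5400 kB".
            metrics[f"proc.{process}.{key}"] = int(rest.split()[0])
    return metrics


# Hypothetical excerpt of a status file, not taken from a real host.
SAMPLE_STATUS = (
    "Name:\tsshd\n"
    "VmSize:\t   72300 kB\n"
    "VmRSS:\t    5400 kB\n"
    "VmSwap:\t       0 kB\n"
)

proc_metrics = parse_vm_fields(SAMPLE_STATUS, "sshd")
print(proc_metrics)
```
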