Akamas Docs
3.5

Prometheus metrics mapping

This page describes how Prometheus metrics map to Akamas metrics for each supported component type.
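
The queries on this page are templates containing placeholders such as `$INSTANCE$`, `$JOB$`, `$DURATION$`, and `%FILTERS%`. This page does not specify how they are expanded, so the following is only a minimal sketch under assumptions: `$NAME$` tokens are replaced by plain strings, and `%FILTERS%` expands into extra comma-separated label matchers. The helper `render_query` is hypothetical and is not part of Akamas or Prometheus.

```python
from typing import Mapping, Optional


def render_query(
    template: str,
    values: Mapping[str, str],
    filters: Optional[Mapping[str, str]] = None,
) -> str:
    """Fill $NAME$ placeholders and expand %FILTERS% (assumed semantics)."""
    query = template
    # Replace each $NAME$ token with its value, e.g. $JOB$ -> node.
    for name, value in values.items():
        query = query.replace(f"${name}$", value)
    # Assumption: %FILTERS% becomes additional label matchers, each
    # prefixed by a comma so it can follow the last existing matcher.
    extra = "".join(f', {label}="{val}"' for label, val in (filters or {}).items())
    return query.replace(" %FILTERS%", extra)


template = 'node_load1{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}'
print(render_query(template, {"INSTANCE": "host1", "JOB": "node"}, {"env": "prod"}))
```

With no extra filters the placeholder simply disappears, leaving a query that selects on `instance` and `job` only.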

Linux

Component metric
Prometheus query

cpu_load_avg

node_load1{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

cpu_num

count(node_cpu_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$", mode="system" %FILTERS%})

cpu_used

sum by (job) (sum by (cpu, job) (rate(node_cpu_seconds_total{instance=~"$INSTANCE$", mode=~"user|system|softirq|irq|nice", job=~"$JOB$" %FILTERS%}[$DURATION$])))

cpu_util

avg by (job) (sum by (cpu, job) (rate(node_cpu_seconds_total{instance=~"$INSTANCE$", mode=~"user|system|softirq|irq|nice", job=~"$JOB$" %FILTERS%}[$DURATION$])))

cpu_util_details

avg by (instance, cpu, mode, job) (sum by (instance, cpu, mode, job) (rate(node_cpu_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])))

disk_io_inflight_details

node_disk_io_now{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

disk_iops

sum by (instance, job) (rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) + sum by (instance, job) (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_iops_details

sum by (instance, device, job) (rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_iops_details

sum by (instance, device, job) (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_iops_details

sum by (instance, device, job) (rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) + sum by (instance, device, job) (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_iops_reads

sum by (instance, job) (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_iops_writes

sum by (instance, job) (rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_read_bytes

sum by (instance, device, job) (rate(node_disk_read_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_read_bytes_details

sum by (instance, device, job) (rate(node_disk_read_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_read_write_bytes

sum by (instance, device, job) (rate(node_disk_written_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_read_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_response_time

avg by (instance, job) ((rate(node_disk_read_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_write_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) / (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) > 0 ))

disk_response_time_details

avg by (instance, device, job) ((rate(node_disk_read_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_write_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) / ((rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) > 0))

disk_response_time_read

rate(node_disk_read_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])/ rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

disk_response_time_worst

max by (instance, job) ((rate(node_disk_read_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_write_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) / (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) > 0 ))

disk_response_time_write

rate(node_disk_write_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])/ rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

disk_swap_used

node_memory_SwapTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_SwapFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

disk_swap_util

((node_memory_SwapTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_SwapFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}) / (node_memory_SwapTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} > 0)) or ((node_memory_SwapTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_SwapFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}))

disk_util_details

rate(node_disk_io_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

disk_write_bytes

sum by (instance, device, job) (rate(node_disk_written_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

disk_write_bytes_details

sum by (instance, device, job) (rate(node_disk_written_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

filesystem_size

node_filesystem_size_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

filesystem_used

node_filesystem_size_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_filesystem_free_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

filesystem_util

((node_filesystem_size_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_filesystem_free_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}) / node_filesystem_size_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

mem_fault_major

rate(node_vmstat_pgmajfault{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

mem_fault_minor

rate(node_vmstat_pgfault{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

mem_swapins

rate(node_vmstat_pswpin{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

mem_swapouts

rate(node_vmstat_pswpout{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

mem_total

node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

mem_used

(node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_MemFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

mem_util

(node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_MemFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}) / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

mem_util_details

(node_memory_Active_file_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

mem_util_details

(node_memory_Active_anon_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

mem_util_details

(node_memory_Inactive_file_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

mem_util_details

(node_memory_Inactive_anon_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

mem_util_nocache

(node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_Buffers_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_Cached_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_MemFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}) / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

network_in_bytes_details

rate(node_network_receive_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

network_out_bytes_details

rate(node_network_transmit_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

network_tcp_retrans

rate(node_netstat_Tcp_RetransSegs{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

os_context_switch

rate(node_context_switches_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

proc_blocked

node_procs_blocked{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}
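
Several of the Linux queries above (for example `disk_response_time`) divide by an operation rate filtered with `> 0`. In PromQL, a comparison without `bool` drops the samples that fail it, so an idle disk produces no data point instead of a division by zero. The sketch below mirrors that behaviour for a single disk in plain Python; the function name and its arguments are illustrative only.

```python
from typing import Optional


def disk_response_time(
    read_time_rate: float,
    write_time_rate: float,
    reads_rate: float,
    writes_rate: float,
) -> Optional[float]:
    """Average seconds per I/O operation, or None when no I/O occurred."""
    ops_rate = reads_rate + writes_rate
    # Mirrors the `> 0` filter: with no completed operations the sample
    # is dropped rather than dividing by zero.
    if ops_rate <= 0:
        return None
    return (read_time_rate + write_time_rate) / ops_rate


print(disk_response_time(0.25, 0.75, 150.0, 50.0))  # busy disk
print(disk_response_time(0.0, 0.0, 0.0, 0.0))       # idle disk
```

The `disk_swap_util` query goes one step further and uses `or` to fall back to a second expression when the guarded division yields nothing.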

JVM

Component metric
Prometheus query

jvm_heap_size

avg(jvm_memory_bytes_max{area="heap" %FILTERS%})

jvm_heap_committed

avg(jvm_memory_bytes_committed{area="heap" %FILTERS%})

jvm_heap_used

avg(jvm_memory_bytes_used{area="heap" %FILTERS%})

jvm_off_heap_used

avg(jvm_memory_bytes_used{area="nonheap" %FILTERS%})

jvm_heap_util

avg(jvm_memory_bytes_used{area="heap" %FILTERS%} / jvm_memory_bytes_max{area="heap" %FILTERS%})

jvm_memory_used

avg(sum by (instance) (jvm_memory_bytes_used))

jvm_heap_young_gen_size

avg(sum by (instance) (jvm_memory_pool_bytes_max{pool=~".*Eden Space|.*Survivor Space" %FILTERS%}))

jvm_heap_young_gen_used

avg(sum by (instance) (jvm_memory_pool_bytes_used{pool=~".*Eden Space|.*Survivor Space" %FILTERS%}))

jvm_heap_old_gen_size

avg(sum by (instance) (jvm_memory_pool_bytes_max{pool=~".*Tenured Gen|.*Old Gen" %FILTERS%}))

jvm_heap_old_gen_used

avg(sum by (instance) (jvm_memory_pool_bytes_used{pool=~".*Tenured Gen|.*Old Gen" %FILTERS%}))

jvm_memory_buffer_pool_used

avg(sum by (instance) (jvm_buffer_pool_used_bytes))

jvm_gc_time

avg(sum by (instance) (rate(jvm_gc_collection_seconds_sum[$DURATION$])))

jvm_gc_count

avg(sum by (instance) (rate(jvm_gc_collection_seconds_count[$DURATION$])))

jvm_gc_duration

(sum(rate(jvm_gc_collection_seconds_sum[$DURATION$])) / sum(rate(jvm_gc_collection_seconds_count[$DURATION$])) > 0 ) or sum(rate(jvm_gc_collection_seconds_count[$DURATION$]))

jvm_threads_current

avg(jvm_threads_current)

jvm_threads_deadlocked

avg(jvm_threads_deadlocked)

transactions_response_time

avg(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

transactions_response_time_max

max(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

transactions_response_time_min

min(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

transactions_response_time_p50

ResponseTime{quantile="0.5", code="200", job=~"$JOB$" %FILTERS%}

transactions_response_time_p85

ResponseTime{quantile="0.85", code="200", job=~"$JOB$" %FILTERS%}

transactions_response_time_p90

ResponseTime{quantile="0.9", code="200", job=~"$JOB$" %FILTERS%}

transactions_response_time_p99

ResponseTime{quantile="0.99", code="200", job=~"$JOB$" %FILTERS%}

transactions_throughput

sum(rate(Ratio_success{job=~"$JOB$" %FILTERS%}[$DURATION$]))

transactions_error_throughput

sum(rate(Ratio_failure{job=~"$JOB$" %FILTERS%}[$DURATION$]))

transactions_error_rate

(avg(rate(Ratio_failure{job=~"$JOB$" %FILTERS%}[$DURATION$]))/avg(rate(Ratio_total{job=~"$JOB$" %FILTERS%}[$DURATION$])))*100

users

sum(jmeter_threads{state="active", job=~"$JOB$" %FILTERS%})

Kubernetes Workload

Component metric
Prometheus query

k8s_workload_desired_pods

kube_deployment_spec_replicas{namespace=~"$NAMESPACE$", deployment=~"$DEPLOYMENT$" %FILTERS%}

k8s_workload_running_pods

kube_deployment_status_replicas_available{namespace=~"$NAMESPACE$", deployment=~"$DEPLOYMENT$" %FILTERS%}

k8s_workload_ready_pods

kube_deployment_status_replicas_ready{namespace=~"$NAMESPACE$", deployment=~"$DEPLOYMENT$" %FILTERS%}

k8s_workload_cpu_used

1e3 * sum(rate(container_cpu_usage_seconds_total{container="", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%}[$DURATION$]))

k8s_workload_memory_used

sum(last_over_time(container_memory_usage_bytes{container="", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%}[$DURATION$]))

k8s_workload_cpu_request

1e3 * sum(kube_pod_container_resource_requests{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%})

k8s_workload_cpu_limit

1e3 * sum(kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%})

k8s_workload_memory_request

sum(kube_pod_container_resource_requests{resource="memory", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%})

k8s_workload_memory_limit

sum(kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%})
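
The CPU queries above multiply by `1e3` to convert cores into millicores: `rate()` over a CPU-seconds counter yields CPU-seconds consumed per second, which is a number of cores. The following worked example reproduces that arithmetic for two counter readings. It is a simplification, since Prometheus' `rate()` also handles counter resets and extrapolates to the window boundaries.

```python
def cpu_millicores(
    cpu_seconds_start: float,
    cpu_seconds_end: float,
    window_seconds: float,
) -> float:
    """Convert a CPU-seconds counter delta over a window into millicores."""
    # CPU-seconds consumed per wall-clock second equals cores in use.
    cores = (cpu_seconds_end - cpu_seconds_start) / window_seconds
    # Same scaling as the `1e3 *` prefix in the queries.
    return 1e3 * cores


# 15 CPU-seconds consumed over a 30-second window is half a core.
print(cpu_millicores(100.0, 115.0, 30.0))
```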

Kubernetes Pod

Component metric
Prometheus query

k8s_pod_cpu_used

1e3 * avg(rate(container_cpu_usage_seconds_total{container="", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}[$DURATION$]))

k8s_pod_cpu_request

1e3 * avg(sum by (pod) (kube_pod_container_resource_requests{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}))

k8s_pod_cpu_limit

1e3 * avg(sum by (pod) (kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}))

k8s_pod_memory_used

avg(last_over_time(container_memory_usage_bytes{container="", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}[$DURATION$]))

k8s_pod_memory_working_set

avg(container_memory_working_set_bytes{container="", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%})

k8s_pod_memory_request

avg(sum by (pod) (kube_pod_container_resource_requests{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}))

k8s_pod_memory_limit

avg(sum by (pod) (kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}))

k8s_pod_restarts

avg(sum by (pod) (increase(kube_pod_container_status_restarts_total{namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}[$DURATION$])))

Kubernetes Container and Docker Container

The following metrics are configured to work for Kubernetes. When using the Docker optimization pack, override the required metrics in the telemetry instance configuration.

Component metric
Prometheus query

container_cpu_used

1e3 * avg(rate(container_cpu_usage_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

container_cpu_used_max

1e3 * max(rate(container_cpu_usage_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

container_cpu_util

avg(rate(container_cpu_usage_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / on (pod) group_left kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_cpu_util_max

max(rate(container_cpu_usage_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / on (pod) group_left kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_cpu_throttled_millicores

1e3 * avg(rate(container_cpu_cfs_throttled_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

container_cpu_throttle_time

avg(last_over_time(container_cpu_cfs_throttled_periods_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / container_cpu_cfs_periods_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_memory_used

avg(last_over_time(container_memory_working_set_bytes{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

container_memory_used_max

max(last_over_time(container_memory_working_set_bytes{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

container_memory_util

avg(last_over_time(container_memory_working_set_bytes{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / on (pod) group_left kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_memory_util_max

max(last_over_time(container_memory_working_set_bytes{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / on (pod) group_left kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_memory_resident_set

avg(last_over_time(container_memory_rss{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

container_memory_cache

avg(last_over_time(container_memory_cache{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

container_cpu_request

1e3 * avg(kube_pod_container_resource_requests{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_cpu_limit

1e3 * avg(kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_memory_request

avg(kube_pod_container_resource_requests{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_memory_limit

avg(kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

container_restarts

avg(increase(kube_pod_container_status_restarts_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

container_oom_kills_count

avg(increase(container_oom_events_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

cost

sum(kube_pod_container_resource_requests{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})*29 + sum(kube_pod_container_resource_requests{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})/1024/1024/1024*8
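
The `cost` query above weights the CPU request (in cores) by 29 and the memory request (converted from bytes to GiB) by 8. The sketch below reproduces that arithmetic for a single container. The function name is illustrative, and this page does not state the unit of the resulting value, so none is assumed here.

```python
GIB = 1024 ** 3  # bytes per GiB, matching /1024/1024/1024 in the query


def container_cost(
    cpu_request_cores: float,
    memory_request_bytes: float,
    cpu_weight: float = 29.0,
    memory_weight: float = 8.0,
) -> float:
    """Weighted sum of CPU (cores) and memory (GiB) requests."""
    return cpu_request_cores * cpu_weight + memory_request_bytes / GIB * memory_weight


# 2 cores and 4 GiB requested: 2 * 29 + 4 * 8
print(container_cost(2, 4 * GIB))
```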

EC2

Component metric
Prometheus query

cpu_util

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_cpuutilization_average{job='$JOB$'}/100

network_in_bytes_details

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_network_in_sum{job='$JOB$'} * count_over_time(aws_ec2_network_in_sum{job='$JOB$'}[300s]) / 300)

network_out_bytes_details

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_network_out_sum{job='$JOB$'} * count_over_time(aws_ec2_network_out_sum{job='$JOB$'}[300s]) / 300)

aws_ec2_credits_cpu_available

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_cpucredit_balance_average{job='$JOB$'}

aws_ec2_credits_cpu_used

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_cpucredit_usage_sum{job='$JOB$'}

disk_read_bytes

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_ebsread_bytes_sum{job='$JOB$'} * count_over_time(aws_ec2_ebsread_bytes_sum{job='$JOB$'}[300s]) / 300)

disk_write_bytes

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_ebswrite_bytes_sum{job='$JOB$'} * count_over_time(aws_ec2_ebswrite_bytes_sum{job='$JOB$'}[300s]) / 300)

aws_ec2_disk_iops

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() ((aws_ec2_ebsread_ops_sum{job='$JOB$'} + aws_ec2_ebswrite_ops_sum{job='$JOB$'}) * count_over_time(aws_ec2_ebsread_ops_sum{job='$JOB$'}[300s])/300)

aws_ec2_disk_iops_reads

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_ebsread_ops_sum{job='$JOB$'} * count_over_time(aws_ec2_ebsread_ops_sum{job='$JOB$'}[300s]) / 300)

aws_ec2_disk_iops_writes

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_ebswrite_ops_sum{job='$JOB$'} * count_over_time(aws_ec2_ebswrite_ops_sum{job='$JOB$'}[300s]) / 300)

aws_ec2_ebs_credits_io_util

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_ebsiobalance__average{job='$JOB$'} / 100

aws_ec2_ebs_credits_bytes_util

aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_ebsbyte_balance__average{job='$JOB$'} / 100

Oracle Database

Component metric
Prometheus query

oracle_sga_total_size

oracledb_memory_size{component='SGA Target', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_sga_free_size

oracledb_memory_size{component='Free SGA Memory Available', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_sga_max_size

oracledb_memory_size{component='Maximum SGA Size', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_pga_target_size

oracledb_memory_size{component='PGA Target', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_redo_buffers_size

oracledb_memory_size{component='Redo Buffers', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_default_buffer_cache_size

oracledb_memory_size{component='DEFAULT buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_default_2k_buffer_cache_size

oracledb_memory_size{component='DEFAULT 2K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_default_4k_buffer_cache_size

oracledb_memory_size{component='DEFAULT 4K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_default_8k_buffer_cache_size

oracledb_memory_size{component='DEFAULT 8K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_default_16k_buffer_cache_size

oracledb_memory_size{component='DEFAULT 16K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_default_32k_buffer_cache_size

oracledb_memory_size{component='DEFAULT 32K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_keep_buffer_cache_size

oracledb_memory_size{component='KEEP buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_recycle_buffer_cache_size

oracledb_memory_size{component='RECYCLE buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_asm_buffer_cache_size

oracledb_memory_size{component='ASM Buffer Cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_shared_io_pool_size

oracledb_memory_size{component='Shared IO Pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_java_pool_size

oracledb_memory_size{component='java pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_large_pool_size

oracledb_memory_size{component='large pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_shared_pool_size

oracledb_memory_size{component='shared pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_streams_pool_size

oracledb_memory_size{component='streams pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_sessions_active_user

oracledb_sessions_value{type='USER', status='ACTIVE', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_sessions_inactive_user

oracledb_sessions_value{type='USER', status='INACTIVE', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_sessions_active_background

oracledb_sessions_value{type='BACKGROUND', status='ACTIVE', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_sessions_inactive_background

oracledb_sessions_value{type='BACKGROUND', status='INACTIVE', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

oracle_buffer_cache_hit_ratio

https://docs.oracle.com/database/121/TGDBA/tune_buffer_cache.htm#TGDBA533

oracle_redo_log_space_requests

rate(oracledb_activity_redo_log_space_requests{instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])

oracle_wait_event_log_file_sync

rate(oracledb_system_event_time_waited{event='log file sync', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_log_file_parallel_write

rate(oracledb_system_event_time_waited{event='log file parallel write', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_log_file_sequential_read

rate(oracledb_system_event_time_waited{event='log file sequential read', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_enq_tx_contention

rate(oracledb_system_event_time_waited{event='enq: TX - contention', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_enq_tx_row_lock_contention

rate(oracledb_system_event_time_waited{event='enq: TX - row lock contention', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_latch_row_cache_objects

rate(oracledb_system_event_time_waited{event='latch: row cache objects', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_latch_shared_pool

rate(oracledb_system_event_time_waited{event='latch: shared pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_resmgr_cpu_quantum

rate(oracledb_system_event_time_waited{event='resmgr:cpu quantum', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_sql_net_message_from_client

rate(oracledb_system_event_time_waited{event='SQL*Net message from client', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_rdbms_ipc_message

rate(oracledb_system_event_time_waited{event='rdbms ipc message', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_db_file_sequential_read

rate(oracledb_system_event_time_waited{event='db file sequential read', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_log_file_switch_checkpoint_incomplete

rate(oracledb_system_event_time_waited{event='log file switch (checkpoint incomplete)', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_row_cache_lock

rate(oracledb_system_event_time_waited{event='row cache lock', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_buffer_busy_waits

rate(oracledb_system_event_time_waited{event='buffer busy waits', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_event_db_file_async_io_submit

rate(oracledb_system_event_time_waited{event='db file async I/O submit', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

oracle_wait_class_commit

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Commit', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_concurrency

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Concurrency', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_system_io

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='System I/O', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_user_io

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='User I/O', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_other

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Other', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_scheduler

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Scheduler', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_idle

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Idle', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_application

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Application', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_network

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Network', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

oracle_wait_class_configuration

sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Configuration', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100
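
The wait event and wait class queries share one pattern: take the per-second rate of a cumulative wait-time counter, divide by 100, and (for wait classes) sum over all events in the class. The sketch below works through that arithmetic in Python. It assumes the counter is reported in centiseconds, as Oracle's `V$SYSTEM_EVENT.TIME_WAITED` column is, so that dividing by 100 yields seconds waited per second. The function names and sample values are hypothetical, and the rate calculation is a simplification of PromQL's `rate()` (no counter-reset handling or extrapolation).

```python
from typing import Dict, Tuple


def waited_seconds_per_second(first_cs: float, last_cs: float, elapsed_s: float) -> float:
    """Simplified stand-in for rate(counter[window]) / 100.

    first_cs and last_cs are counter samples in centiseconds (assumed unit),
    taken elapsed_s seconds apart.
    """
    return (last_cs - first_cs) / elapsed_s / 100


def wait_class_total(samples: Dict[str, Tuple[float, float]], elapsed_s: float) -> float:
    """Mirror 'sum without(event)': add up the per-event rates of one wait class."""
    return sum(waited_seconds_per_second(first, last, elapsed_s)
               for first, last in samples.values())


# Made-up samples for wait_class='Commit', keyed by event name.
commit_samples = {
    "log file sync": (12_000, 13_500),          # +1500 cs over the window
    "hypothetical commit event": (400, 3_400),  # +3000 cs over the window
}

print(waited_seconds_per_second(12_000, 13_500, 60))  # 0.25
print(wait_class_total(commit_samples, 60))           # 0.75
```

A value of 0.25 means that sessions spent, on average, a quarter of a second waiting on that event for every second of wall-clock time.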

Web Application

Component metric
Prometheus query

transactions_response_time

avg(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

transactions_response_time_max

max(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

transactions_response_time_min

min(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

transactions_response_time_p50

ResponseTime{quantile="0.5", code="200", job=~"$JOB$" %FILTERS%}

transactions_response_time_p85

ResponseTime{quantile="0.85", code="200", job=~"$JOB$" %FILTERS%}

transactions_response_time_p90

ResponseTime{quantile="0.9", code="200", job=~"$JOB$" %FILTERS%}

transactions_response_time_p99

ResponseTime{quantile="0.99", code="200", job=~"$JOB$" %FILTERS%}

transactions_throughput

sum(rate(Ratio_success{job=~"$JOB$" %FILTERS%}[$DURATION$]))

transactions_error_throughput

sum(rate(Ratio_failure{job=~"$JOB$" %FILTERS%}[$DURATION$]))

transactions_error_rate

(avg(rate(Ratio_failure{job=~"$JOB$" %FILTERS%}[$DURATION$]))/avg(rate(Ratio_total{job=~"$JOB$" %FILTERS%}[$DURATION$])))*100

users

sum(jmeter_threads{state="active", job=~"$JOB$" %FILTERS%})
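
The response-time and error-rate queries above are ratios of counter rates. Because both rates use the same `$DURATION$` window, the window length cancels and each ratio reduces to a ratio of counter increases. The Python sketch below shows the arithmetic for a single series. The function names and numbers are hypothetical, and the sketch does not reproduce the `avg`, `max` or `min` aggregation across series.

```python
from typing import Optional


def average_response_time(sum_delta: float, count_delta: float) -> Optional[float]:
    """rate(ResponseTime_sum) / rate(ResponseTime_count) for one series.

    Returns None when no requests were observed; the '> 0' filter in the
    queries plays a similar role by dropping series without a positive value.
    """
    if count_delta <= 0:
        return None
    return sum_delta / count_delta


def error_rate_percent(failure_delta: float, total_delta: float) -> Optional[float]:
    """(rate(Ratio_failure) / rate(Ratio_total)) * 100 for one series."""
    if total_delta <= 0:
        return None
    return 100.0 * failure_delta / total_delta


# Made-up counter increases over one window.
print(average_response_time(sum_delta=12.0, count_delta=48))  # 0.25
print(error_rate_percent(failure_delta=3, total_delta=120))   # 2.5
```

The response time is expressed in whatever unit the `ResponseTime` metric is exported in; this page does not state it.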


The default metrics in the Kubernetes Container table are based on cadvisor and kube-state-metrics.

The default metrics in the Kubernetes Pod table are based on cadvisor and kube-state-metrics.

The default metrics in the EC2 table are based on the CloudWatch Exporter, configured with the attached custom configuration file.

The default metrics in the Oracle Database table are based on the OracleDB Exporter, extending the default queries with the attached custom configuration file.

The default metrics in the Web Application table are based on the Prometheus Listener for Jmeter.