Grafana Cloud
Grafana Cloud is a fully managed observability platform that provides metrics, logs, and traces with Prometheus-compatible metrics storage. For self-hosted Prometheus or Amazon Managed Prometheus, see the Prometheus configuration page.
Metric Sources & Prerequisites
Akamas Insights collects Kubernetes and application runtime metrics from your Prometheus-compatible endpoint. Before connecting your data source, ensure the following components are deployed and configured in your cluster. Two setup configurations are supported: the Kubernetes Monitoring Stack and OpenTelemetry Kubernetes Metrics.
Kubernetes Monitoring Stack
The simplest way to ensure all required metrics are available is to deploy the kube-prometheus-stack Helm chart. It bundles all the necessary components:
kube-state-metrics — Kubernetes object state metrics
Prometheus node exporter — Node hardware and OS metrics
Kubernetes recording rules from the kubernetes-mixin project
Recording rules are required. Akamas Insights depends on pre-computed Prometheus recording rules provided by the kubernetes-mixin. These rules are automatically included in kube-prometheus-stack, but they are not available in standalone Prometheus or Mimir deployments unless explicitly configured.
Without these recording rules, the data import will fail to discover and extract workload, pod, and node metrics.
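One way to confirm that the rules are present before starting an import is to query the endpoint for a series that only a recording rule can produce. The sketch below assumes the standard Prometheus HTTP API path (`/api/v1/query`) and uses `node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate` as an example kubernetes-mixin rule name. The exact set of rules Akamas Insights reads is not listed on this page, and rule names differ between kubernetes-mixin versions, so treat the name as a placeholder. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Example rule name from kubernetes-mixin. This is an assumption for
# illustration, not an official list of the rules Akamas Insights needs.
EXAMPLE_RULES = [
    "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate",
]


def build_query_url(endpoint: str, promql: str) -> str:
    """Return the instant-query URL for a Prometheus-compatible endpoint."""
    return f"{endpoint.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def rule_has_series(response_body: str) -> bool:
    """True if an instant-query response for count(<rule>) returned any series.

    count() over an empty vector yields an empty result, so an empty
    `result` list means the rule is not producing data.
    """
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        return False
    return len(payload["data"]["result"]) > 0
```

Sending `build_query_url(endpoint, f"count({rule})")` for each rule and passing the response body to `rule_has_series` gives a quick yes/no per rule.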
Node Labels for Cluster Autoscaler Recommendations
To provide optimization recommendations for Cluster Autoscaler node groups, Akamas Insights needs access to Kubernetes node labels (instance types, topology, node group identifiers, etc.).
By default, kube-state-metrics does not expose all labels on the kube_node_labels metric. You must explicitly allow them using the metricLabelsAllowlist setting:
Helm values (values.yaml):
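    kube-state-metrics:
      metricLabelsAllowlist:
        - nodes=[*]
Or, if deploying kube-state-metrics standalone, set the equivalent option directly, for example with the --metric-labels-allowlist=nodes=[*] command-line argument.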
Without this configuration, Akamas Insights will still collect and optimize workload metrics, but Cluster Autoscaler node group recommendations will not be available.
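To check whether the allowlist took effect, it helps to know how kube-state-metrics names the exposed labels: each Kubernetes label becomes a `label_`-prefixed Prometheus label with characters outside `[a-zA-Z0-9_]` replaced by underscores. The sketch below applies that rule to the labels of one `kube_node_labels` series. The two Kubernetes labels in `WANTED` are well-known examples chosen for illustration, not a list of what Akamas Insights requires, and the function names are hypothetical.

```python
import re

# Well-known node labels used as examples only (assumption).
WANTED = [
    "node.kubernetes.io/instance-type",
    "topology.kubernetes.io/zone",
]


def ksm_label_name(kubernetes_label: str) -> str:
    """Map a Kubernetes label key to the name kube-state-metrics exposes."""
    return "label_" + re.sub(r"[^a-zA-Z0-9_]", "_", kubernetes_label)


def missing_node_labels(series_labels: dict, wanted: list) -> list:
    """Return the Kubernetes label keys absent from a kube_node_labels series."""
    return [key for key in wanted if ksm_label_name(key) not in series_labels]
```

If `missing_node_labels` returns every wanted key for all nodes, the allowlist is most likely not applied.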
OpenTelemetry Kubernetes Metrics
If your environment uses the OpenTelemetry Collector instead of (or alongside) kube-state-metrics, Akamas Insights can read Kubernetes metrics from the OTel k8s_* metric family instead.
OTel Kubernetes metrics do not require the kubernetes-mixin recording rules. Workload discovery and resource metrics are available directly from OTel without pre-computed aggregations.
The following OpenTelemetry Collector components are required:
Node, pod, container, and workload discovery (k8s_node_info, k8s_pod_info, k8s_container_ready, k8s_deployment_desired, etc.)
Resource metrics (k8s_container_cpu_request, k8s_container_memory_limit_bytes, k8s_node_allocatable_cpu, etc.)
Enriches metrics with workload labels (deployment, statefulset, daemonset, job_name)
Example OTel Collector configuration:
Application Runtime Metrics
Beyond Kubernetes infrastructure metrics, Akamas Insights can analyze application runtime data to provide full-stack optimization recommendations, including JVM heap sizing and Node.js V8 heap configuration, all coordinated with Kubernetes resource limits. This requires your applications to expose runtime metrics via one of the supported exporters.
Label Requirements
Runtime metrics must include labels that identify the Kubernetes node, namespace, pod, and container where the application runs. Akamas Insights uses these labels to correlate runtime metrics with the corresponding Kubernetes resources.
The specific label names used by your exporter are configured during the data import setup in Akamas Insights (e.g., node vs k8s_node_name, pod vs k8s_pod_name).
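The correlation described above amounts to mapping whatever label names an exporter uses onto the four Kubernetes dimensions. The following is a minimal sketch of that idea, not the Akamas Insights implementation. `k8s_node_name` and `k8s_pod_name` come from the example above; the namespace and container names in `OTEL_STYLE` are assumed by analogy, and all function and variable names are hypothetical.

```python
DIMENSIONS = ("node", "namespace", "pod", "container")

# Dimension -> label name used by the exporter. The namespace and container
# entries are assumptions made by analogy with k8s_node_name / k8s_pod_name.
OTEL_STYLE = {
    "node": "k8s_node_name",
    "namespace": "k8s_namespace_name",
    "pod": "k8s_pod_name",
    "container": "k8s_container_name",
}


def normalize_labels(series_labels: dict, label_map: dict) -> dict:
    """Return {dimension: value} for one series, or raise if a label is absent."""
    missing = [dim for dim in DIMENSIONS if label_map[dim] not in series_labels]
    if missing:
        raise KeyError("series lacks labels for: " + ", ".join(missing))
    return {dim: series_labels[label_map[dim]] for dim in DIMENSIONS}
```

A series that raises here would not be attributable to a Kubernetes container, which is why all four labels need to be present on runtime metrics.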
Java
Akamas Insights supports three Java instrumentation methods. Only one is required per application.
Prometheus JMX Exporter - supported: 1.0.1+
Required metrics:
jvm_memory_used_bytes, jvm_memory_committed_bytes, jvm_memory_max_bytes - labels: area
jvm_memory_pool_used_bytes, jvm_memory_pool_committed_bytes, jvm_memory_pool_max_bytes - labels: pool
jvm_gc_collection_seconds_sum, jvm_gc_collection_seconds_count - labels: gc
jvm_threads_current
Micrometer / Spring Boot Actuator - supported: Micrometer 1.14+ / Spring Boot 3.4+
Required metrics:
jvm_memory_used_bytes, jvm_memory_committed_bytes, jvm_memory_max_bytes - labels: area, id
jvm_gc_pause_seconds_sum, jvm_gc_pause_seconds_count - labels: action, cause, gc
jvm_gc_concurrent_phase_time_seconds_sum, jvm_gc_concurrent_phase_time_seconds_count - labels: cause
jvm_threads_live_threads
OpenTelemetry Java Agent - supported: 2.25.0+
Required metrics:
jvm_memory_used_bytes, jvm_memory_committed_bytes, jvm_memory_limit_bytes - labels: jvm_memory_type, jvm_memory_pool_name
jvm_gc_duration_seconds_sum, jvm_gc_duration_seconds_count - labels: jvm_gc_action, jvm_gc_name
jvm_thread_count - labels: jvm_thread_state, jvm_thread_daemon
Node.js
OpenTelemetry Node.js SDK - supported: @opentelemetry/instrumentation-runtime-node 0.22.0+
Required metrics:
v8js_memory_heap_used_bytes, v8js_memory_heap_space_physical_size_bytes, v8js_memory_heap_limit_bytes - labels: v8js_heap_space_name
v8js_gc_duration_seconds_sum, v8js_gc_duration_seconds_count, v8js_gc_duration_seconds_bucket - labels: v8js_gc_type
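The required-metric lists for the Java and Node.js exporters can be checked mechanically against a sample of what an application exposes. The sketch below scans Prometheus text-format output for the required names. It only applies where the metrics are available as Prometheus exposition text (for example a scraped `/metrics` page); the dictionary keys and function names are hypothetical, and label presence is not verified.

```python
import re

# Required metric names per exporter, copied from the lists above.
REQUIRED_METRICS = {
    "jmx_exporter": [
        "jvm_memory_used_bytes", "jvm_memory_committed_bytes",
        "jvm_memory_max_bytes", "jvm_memory_pool_used_bytes",
        "jvm_memory_pool_committed_bytes", "jvm_memory_pool_max_bytes",
        "jvm_gc_collection_seconds_sum", "jvm_gc_collection_seconds_count",
        "jvm_threads_current",
    ],
    "micrometer": [
        "jvm_memory_used_bytes", "jvm_memory_committed_bytes",
        "jvm_memory_max_bytes", "jvm_gc_pause_seconds_sum",
        "jvm_gc_pause_seconds_count",
        "jvm_gc_concurrent_phase_time_seconds_sum",
        "jvm_gc_concurrent_phase_time_seconds_count",
        "jvm_threads_live_threads",
    ],
    "otel_java_agent": [
        "jvm_memory_used_bytes", "jvm_memory_committed_bytes",
        "jvm_memory_limit_bytes", "jvm_gc_duration_seconds_sum",
        "jvm_gc_duration_seconds_count", "jvm_thread_count",
    ],
    "otel_nodejs": [
        "v8js_memory_heap_used_bytes",
        "v8js_memory_heap_space_physical_size_bytes",
        "v8js_memory_heap_limit_bytes", "v8js_gc_duration_seconds_sum",
        "v8js_gc_duration_seconds_count", "v8js_gc_duration_seconds_bucket",
    ],
}

_METRIC_NAME = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*")


def exposed_metric_names(exposition_text: str) -> set:
    """Collect metric names from sample lines, skipping comments and blanks."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _METRIC_NAME.match(line)
        if match:
            names.add(match.group(0))
    return names


def missing_metrics(exposition_text: str, exporter: str) -> list:
    """Return required metric names for `exporter` that the text lacks."""
    exposed = exposed_metric_names(exposition_text)
    return [name for name in REQUIRED_METRICS[exporter] if name not in exposed]
```

An empty list from `missing_metrics` means every required name appears at least once in the sample.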
Requirements
Before connecting Grafana Cloud to Akamas Insights, ensure you have:
Required Credentials
Endpoint URL: Your Grafana Cloud Prometheus endpoint URL
Username: Your Grafana Cloud instance ID (also called user ID)
Access Token: A Grafana Cloud access token with appropriate permissions (see below)
Required Permissions
Your access token must have the following permission:
Metrics:Read: Query metrics from your Grafana Cloud Prometheus instance
You can obtain this information from the following references:
Configuration
To connect to Grafana Cloud:
Endpoint URL: Enter your Grafana Cloud Prometheus endpoint
Username: Your Grafana Cloud instance ID
Access Token: Generate an access token with Metrics:Read permissions
Test Connection: Verify the configuration
Save: Save the integration
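Before entering the values in the form, the same three credentials can be tried out by hand. The sketch below builds an instant query for `up` with HTTP basic authentication, using the instance ID as the username and the access token as the password. It assumes the endpoint URL is the Prometheus query base to which `/api/v1/query` is appended; the function name and the sample values are hypothetical, and this is not how the Test Connection step is implemented.

```python
import base64
import urllib.request
from urllib.parse import urlencode


def build_test_request(endpoint_url: str, username: str, access_token: str,
                       promql: str = "up") -> urllib.request.Request:
    """Build (but do not send) an authenticated instant-query request."""
    credentials = f"{username}:{access_token}".encode("utf-8")
    auth = "Basic " + base64.b64encode(credentials).decode("ascii")
    query = urlencode({"query": promql})
    url = f"{endpoint_url.rstrip('/')}/api/v1/query?{query}"
    return urllib.request.Request(url, headers={"Authorization": auth})
```

Passing the result to `urllib.request.urlopen(request, timeout=10)` sends the query. An HTTP 401 or 403 response usually points at a wrong instance ID or a token without the Metrics:Read permission.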
Data Import Settings
These advanced settings allow you to tune the data extraction process for Grafana Cloud to optimize performance and manage query load. In most cases, the default values work well, but you can adjust them based on your environment size and performance requirements.
Retries: Number of retry attempts for failed Prometheus queries (default: 1). Increase this value if you experience intermittent connectivity issues or timeout errors. Higher values increase resilience but may slow down extraction when encountering persistent failures.
Parallelism: Number of concurrent worker threads for parallel query execution (default: 5). Increase for faster extraction if your Prometheus instance can handle higher query rates, or decrease to reduce load on the server.
Batch Size: Number of entities to batch in a single Prometheus query (default: 20). Determines how many pods, containers, or other entities are included in a single query. Increase to reduce the total number of queries (faster extraction), or decrease to reduce individual query complexity and memory usage.
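How the three settings interact can be illustrated with a small sketch: entities are split into batches, batches are run by a pool of workers, and each batch query is retried on failure. This is an illustration under stated assumptions, not the Akamas Insights extractor. In particular it assumes that a Retries value of 1 means one extra attempt after the first failure, and `run_query` is a hypothetical callable standing in for a real Prometheus query.

```python
from concurrent.futures import ThreadPoolExecutor


def batches(entities: list, batch_size: int = 20) -> list:
    """Split entities into consecutive groups of at most batch_size."""
    return [entities[i:i + batch_size]
            for i in range(0, len(entities), batch_size)]


def with_retries(fn, retries: int = 1):
    """Call fn(); on an exception, try again up to `retries` more times."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return fn()
        except Exception as exc:  # broad on purpose: any query failure
            last_error = exc
    raise last_error


def extract_all(entities: list, run_query, *, retries: int = 1,
                parallelism: int = 5, batch_size: int = 20) -> list:
    """Run run_query(batch) for every batch, in parallel, with retries."""
    work = batches(entities, batch_size)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        return list(pool.map(
            lambda batch: with_retries(lambda: run_query(batch), retries),
            work,
        ))
```

With the defaults, 45 pods produce three queries (20, 20, and 5 entities) that run concurrently. Raising the batch size to 50 would collapse them into one larger query, which is the trade-off between query count and per-query complexity described above.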