# Prometheus

Akamas Insights supports multiple Prometheus-compatible data sources, including standard **Prometheus**, **Thanos**, and **Amazon Managed Prometheus** (AMP). This page covers configuration for self-hosted Prometheus and AWS-managed deployments. For Grafana Cloud's Prometheus implementation, see the [Grafana Cloud configuration page](/insights/connecting-your-data/datasources/grafana-cloud.md).

## Metric Sources & Prerequisites

Akamas Insights collects Kubernetes and application runtime metrics from your Prometheus-compatible endpoint. Before connecting your data source, ensure the following components are deployed and configured in your cluster. Two setup configurations are supported: the **Kubernetes Monitoring Stack** and **OpenTelemetry Kubernetes Metrics**.

### Kubernetes Monitoring Stack

The simplest way to ensure all required metrics are available is to deploy the [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) Helm chart. It bundles all the necessary components:

* [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) — Kubernetes object state metrics
* [Prometheus node exporter](https://github.com/prometheus/node_exporter) — Node hardware and OS metrics
* Kubernetes recording rules from the [kubernetes-mixin](https://github.com/kubernetes-monitoring/kubernetes-mixin) project

{% hint style="warning" %}
**Recording rules are required.** Akamas Insights depends on pre-computed Prometheus recording rules provided by the kubernetes-mixin. These rules are automatically included in kube-prometheus-stack, but they are **not** available in standalone Prometheus or Mimir deployments unless explicitly configured.

Without these recording rules, the data import will fail to discover and extract workload, pod, and node metrics.
{% endhint %}
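
One way to check for recording rules before running a data import is to ask Prometheus which rules it has loaded. The sketch below parses the JSON returned by the standard Prometheus `GET /api/v1/rules` endpoint. The helper name `recording_rule_names` and the sample payload are illustrative assumptions. The rule shown is only an example of a kubernetes-mixin rule name, not the list of rules Akamas Insights requires.

```python
def recording_rule_names(rules_response: dict) -> set[str]:
    """Return the names of all recording rules in a /api/v1/rules response."""
    names = set()
    for group in rules_response.get("data", {}).get("groups", []):
        for rule in group.get("rules", []):
            # Prometheus tags every rule as either "recording" or "alerting".
            if rule.get("type") == "recording":
                names.add(rule["name"])
    return names


# Trimmed, hand-written sample of a /api/v1/rules response (illustrative only).
sample = {
    "status": "success",
    "data": {
        "groups": [
            {
                "name": "k8s.rules.container_cpu_usage_seconds_total",
                "rules": [
                    {
                        "type": "recording",
                        "name": "node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate",
                    },
                    {"type": "alerting", "name": "KubePodCrashLooping"},
                ],
            }
        ]
    },
}

print(sorted(recording_rule_names(sample)))
```

An empty result on a real endpoint suggests that no recording rules are loaded at all, which is the typical situation on a standalone Prometheus without the kubernetes-mixin rules.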

**Node Labels for Cluster Autoscaler Recommendations**

To provide optimization recommendations for Cluster Autoscaler node groups, Akamas Insights needs access to Kubernetes node labels (instance types, topology, node group identifiers, etc.).

By default, kube-state-metrics does not expose all labels on the `kube_node_labels` metric. You must explicitly allow them using the `metricLabelsAllowlist` setting:

**Helm values (`values.yaml`):**

```yaml
kube-state-metrics:
  metricLabelsAllowlist:
    - nodes=[*]
```

Or, if deploying kube-state-metrics standalone:

```yaml
metricLabelsAllowlist:
  - nodes=[*]
```

{% hint style="info" %}
Without this configuration, Akamas Insights will still collect and optimize workload metrics, but Cluster Autoscaler node group recommendations will not be available.
{% endhint %}
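
To confirm that the allowlist took effect, you can run an instant query for `kube_node_labels` and inspect which `label_*` labels come back. The following is a minimal sketch that works on the JSON of a Prometheus `/api/v1/query` response. The function name and the sample series are assumptions made for illustration.

```python
def exposed_node_labels(query_response: dict) -> dict:
    """Map each node to the sorted `label_*` keys found on kube_node_labels."""
    labels_by_node = {}
    for series in query_response.get("data", {}).get("result", []):
        metric = series.get("metric", {})
        node = metric.get("node", "<unknown>")
        # kube-state-metrics exposes Kubernetes labels with a `label_` prefix.
        labels_by_node[node] = sorted(k for k in metric if k.startswith("label_"))
    return labels_by_node


# Hand-written sample of an instant query result for `kube_node_labels`.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "kube_node_labels",
                    "node": "node-a",
                    "label_node_kubernetes_io_instance_type": "m5.large",
                    "label_topology_kubernetes_io_zone": "us-east-1a",
                },
                "value": [1700000000, "1"],
            }
        ],
    },
}

print(exposed_node_labels(sample))
```

If a node maps to an empty list, the node labels are most likely not being exported yet.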

### OpenTelemetry Kubernetes Metrics

If your environment uses the [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) instead of (or alongside) kube-state-metrics, Akamas Insights can collect Kubernetes metrics from OTel `k8s_*` metrics.

{% hint style="info" %}
OTel Kubernetes metrics do **not** require the kubernetes-mixin recording rules. Workload discovery and resource metrics are available directly from OTel without pre-computed aggregations.
{% endhint %}

The following OpenTelemetry Collector components are required:

| Component                                                                                                                                | Purpose                                                                                                                               |
| ---------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| [k8s Cluster Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sclusterreceiver)          | Node, pod, container, and workload discovery (`k8s_node_info`, `k8s_pod_info`, `k8s_container_ready`, `k8s_deployment_desired`, etc.) |
| [Kubelet Stats Receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kubeletstatsreceiver)      | Resource metrics (`k8s_container_cpu_request`, `k8s_container_memory_limit_bytes`, `k8s_node_allocatable_cpu`, etc.)                  |
| [K8s Attributes Processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/k8sattributesprocessor) | Enriches metrics with workload labels (`deployment`, `statefulset`, `daemonset`, `job_name`)                                          |

**Example OTel Collector configuration:**

```yaml
receivers:
  k8s_cluster:
    collection_interval: 30s
    auth_type: serviceAccount
  kubeletstats:
    collection_interval: 30s
    auth_type: serviceAccount
    endpoint: "https://${env:K8S_NODE_NAME}:10250"
    insecure_skip_verify: true

processors:
  k8sattributes:
    auth_type: serviceAccount
    extract:
      metadata:
        - k8s.deployment.name
        - k8s.statefulset.name
        - k8s.daemonset.name
        - k8s.job.name
        - k8s.cronjob.name

exporters:
  prometheusremotewrite:
    endpoint: "http://your-prometheus-endpoint/api/v1/write"

service:
  pipelines:
    metrics:
      receivers: [k8s_cluster, kubeletstats]
      processors: [k8sattributes]
      exporters: [prometheusremotewrite]
```

### Application Runtime Metrics

Beyond Kubernetes infrastructure metrics, Akamas Insights can analyze application runtime data to provide full-stack optimization recommendations, including JVM heap sizing and Node.js V8 heap configuration, all coordinated with Kubernetes resource limits. This requires your applications to expose runtime metrics via one of the supported exporters.

#### Label Requirements

Runtime metrics must include labels that identify the Kubernetes **node**, **namespace**, **pod**, and **container** where the application runs. Akamas Insights uses these labels to correlate runtime metrics with the corresponding Kubernetes resources.

The specific label names used by your exporter are configured during the data import setup in Akamas Insights (e.g., `node` vs `k8s_node_name`, `pod` vs `k8s_pod_name`).
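
As a rough illustration of this requirement, the sketch below checks whether a single runtime series carries all four identifying labels under a given naming scheme. The mapping structure and the names `k8s_namespace_name` and `k8s_container_name` are assumptions that follow the OpenTelemetry-style examples above. This is not how Akamas Insights stores the mapping internally.

```python
REQUIRED_ROLES = ("node", "namespace", "pod", "container")


def missing_roles(series_labels: dict, label_mapping: dict) -> list:
    """Return the roles whose mapped label is absent or empty on the series."""
    missing = []
    for role in REQUIRED_ROLES:
        # Fall back to the role name itself when no custom label is mapped.
        label_name = label_mapping.get(role, role)
        if not series_labels.get(label_name):
            missing.append(role)
    return missing


# Hypothetical OpenTelemetry-style label names for the four roles.
otel_mapping = {
    "node": "k8s_node_name",
    "namespace": "k8s_namespace_name",
    "pod": "k8s_pod_name",
    "container": "k8s_container_name",
}

# Labels of one runtime series; the container label is missing on purpose.
series = {
    "k8s_node_name": "node-a",
    "k8s_namespace_name": "shop",
    "k8s_pod_name": "cart-7d9f",
}

print(missing_roles(series, otel_mapping))
```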

#### Java

Akamas Insights supports three Java instrumentation methods. Only one is required per application.

[**Prometheus JMX Exporter**](https://github.com/prometheus/jmx_exporter) - supported: [1.0.1](https://github.com/prometheus/jmx_exporter/releases/tag/1.0.1)+

Required metrics:

* `jvm_memory_used_bytes`, `jvm_memory_committed_bytes`, `jvm_memory_max_bytes` - labels: `area`
* `jvm_memory_pool_used_bytes`, `jvm_memory_pool_committed_bytes`, `jvm_memory_pool_max_bytes` - labels: `pool`
* `jvm_gc_collection_seconds_sum`, `jvm_gc_collection_seconds_count` - labels: `gc`
* `jvm_threads_current`

[**Micrometer / Spring Boot Actuator**](https://micrometer.io/) - supported: [Micrometer 1.14](https://github.com/micrometer-metrics/micrometer/releases/tag/v1.14.0)+ / [Spring Boot 3.4](https://github.com/spring-projects/spring-boot/releases/tag/v3.4.0)+

Required metrics:

* `jvm_memory_used_bytes`, `jvm_memory_committed_bytes`, `jvm_memory_max_bytes` - labels: `area`, `id`
* `jvm_gc_pause_seconds_sum`, `jvm_gc_pause_seconds_count` - labels: `action`, `cause`, `gc`
* `jvm_gc_concurrent_phase_time_seconds_sum`, `jvm_gc_concurrent_phase_time_seconds_count` - labels: `cause`
* `jvm_threads_live_threads`

[**OpenTelemetry Java Agent**](https://opentelemetry.io/docs/languages/java/) - supported: [2.25.0](https://github.com/open-telemetry/opentelemetry-java-instrumentation/releases/tag/v2.25.0)+

Required metrics:

* `jvm_memory_used_bytes`, `jvm_memory_committed_bytes`, `jvm_memory_limit_bytes` - labels: `jvm_memory_type`, `jvm_memory_pool_name`
* `jvm_gc_duration_seconds_sum`, `jvm_gc_duration_seconds_count` - labels: `jvm_gc_action`, `jvm_gc_name`
* `jvm_thread_count` - labels: `jvm_thread_state`, `jvm_thread_daemon`

#### Node.js

[**OpenTelemetry Node.js SDK**](https://opentelemetry.io/docs/languages/js/) - supported: [@opentelemetry/instrumentation-runtime-node 0.22.0](https://www.npmjs.com/package/@opentelemetry/instrumentation-runtime-node)+

Required metrics:

* `v8js_memory_heap_used_bytes`, `v8js_memory_heap_space_physical_size_bytes`, `v8js_memory_heap_limit_bytes` - labels: `v8js_heap_space_name`
* `v8js_gc_duration_seconds_sum`, `v8js_gc_duration_seconds_count`, `v8js_gc_duration_seconds_bucket` - labels: `v8js_gc_type`

## Default Prometheus

[Prometheus](https://prometheus.io/) is an open-source monitoring and alerting toolkit designed for reliability and scalability in cloud-native environments.

### Requirements

Before connecting Prometheus to Akamas Insights, ensure you have:

#### Required Information

* **Endpoint URL**: The HTTP endpoint of your Prometheus server (e.g., `http://prometheus-server:9090`);
* **Custom Headers** (optional): Additional HTTP headers for authentication or routing.

#### Authentication Options

Akamas Insights supports Prometheus instances with and without authentication:

* **No Authentication**: Default Prometheus installations without security enabled;
* **Custom Headers**: For authenticated Prometheus variants like Thanos, you can configure custom HTTP headers:
  * **Bearer Token Authentication**: Add `Authorization: Bearer <token>` header;
  * **Basic Authentication**: Add `Authorization: Basic <base64-credentials>` header;
  * **API Key Headers**: Add custom headers like `X-API-Key: <key>` if required by your setup;
  * **Proxy Headers**: Add routing or identification headers required by reverse proxies.
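
If you need to prepare the `Authorization` header value by hand, the sketch below shows how the Bearer and Basic variants are built. The function names and the credentials are placeholders. Basic authentication encodes `username:password` as Base64, as defined by the HTTP Basic scheme.

```python
import base64


def bearer_header(token: str) -> dict[str, str]:
    """Build an Authorization header for Bearer token authentication."""
    return {"Authorization": f"Bearer {token}"}


def basic_header(username: str, password: str) -> dict[str, str]:
    """Build an Authorization header for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Placeholder credentials, for illustration only.
print(bearer_header("example-token"))
print(basic_header("user", "pass"))
```

The value after `Basic ` is what goes into the `<base64-credentials>` placeholder above.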

{% hint style="info" %}
For Grafana Cloud Prometheus, use the dedicated [Grafana Cloud configuration](/insights/connecting-your-data/datasources/grafana-cloud.md), which handles authentication automatically.
{% endhint %}

#### Network Access

* **Network connectivity** from Akamas Insights to your Prometheus endpoint;
* **Firewall** rules allowing HTTP/HTTPS traffic to the Prometheus API.

{% hint style="info" %}
You can obtain this information from the following references:

* [Prometheus HTTP API](https://prometheus.io/docs/prometheus/latest/querying/api/)
* [Prometheus Configuration](https://prometheus.io/docs/prometheus/latest/configuration/configuration/)

{% endhint %}

### Configuration

To connect to a Prometheus instance:

1. **Endpoint URL**: Enter the Prometheus server URL;
   * Format: `http://prometheus-server:9090`
2. **Test Connection**: Verify connectivity;
3. **Save**: Save the integration.

## Amazon Managed Prometheus (AMP)

[Amazon Managed Prometheus](https://aws.amazon.com/prometheus/) is a fully managed Prometheus-compatible monitoring service that provides secure, scalable metrics storage and querying for container workloads.

### Requirements

Before connecting Amazon Managed Prometheus to Akamas Insights, ensure you have:

#### Required Credentials

* **Endpoint URL**: Your AMP workspace query endpoint in the format `https://aps-workspaces.{region}.amazonaws.com/workspaces/{workspace-id}/`;
* **AWS Access Key**: IAM access key ID with AMP read permissions;
* **AWS Secret Key**: Corresponding IAM secret access key;
* **AWS Region**: The AWS region where your AMP workspace is deployed (e.g., `us-east-1`).
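
The endpoint format above can be assembled from, or split back into, its region and workspace ID. The following sketch does both. The function names and the workspace ID `ws-example` are hypothetical.

```python
import re

# Pattern for the AMP workspace endpoint format documented above.
_AMP_ENDPOINT = re.compile(
    r"^https://aps-workspaces\.([a-z0-9-]+)\.amazonaws\.com/workspaces/([^/]+)/?$"
)


def amp_query_endpoint(region: str, workspace_id: str) -> str:
    """Build the workspace endpoint URL from a region and workspace ID."""
    return f"https://aps-workspaces.{region}.amazonaws.com/workspaces/{workspace_id}/"


def parse_amp_endpoint(url: str) -> tuple[str, str]:
    """Split a workspace endpoint URL into (region, workspace_id)."""
    match = _AMP_ENDPOINT.match(url)
    if match is None:
        raise ValueError(f"not an AMP workspace endpoint: {url}")
    return match.group(1), match.group(2)


endpoint = amp_query_endpoint("us-east-1", "ws-example")
print(endpoint)
print(parse_amp_endpoint(endpoint))
```

Parsing the URL this way is a quick check that the region in the endpoint matches the **AWS Region** value you plan to enter.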

#### Required IAM Permissions

Your IAM credentials must have the following permissions:

* `aps:QueryMetrics`: Query metrics from the AMP workspace;
* `aps:GetMetricMetadata`: Retrieve metadata about available metrics.
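
A least-privilege IAM policy granting only these two actions could look like the document generated below. This is a sketch: the account ID, workspace ID, and function name are placeholders, and scoping `Resource` to a single workspace ARN is a suggestion rather than a documented requirement.

```python
import json


def amp_read_policy(region: str, account_id: str, workspace_id: str) -> dict:
    """Build an IAM policy document allowing read-only queries on one workspace."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["aps:QueryMetrics", "aps:GetMetricMetadata"],
                # ARN of a single AMP workspace; all values are placeholders.
                "Resource": f"arn:aws:aps:{region}:{account_id}:workspace/{workspace_id}",
            }
        ],
    }


policy = amp_read_policy("us-east-1", "123456789012", "ws-example")
print(json.dumps(policy, indent=2))
```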

{% hint style="info" %}
You can obtain this information from the following references:

* [AWS AMP IAM Permissions](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-and-IAM.html)
* [AWS AMP Query API](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-onboard-query.html)

{% endhint %}

### Configuration

To connect to Amazon Managed Prometheus:

1. **Endpoint URL**: Enter your workspace query endpoint;
   * Format: `https://aps-workspaces.{region}.amazonaws.com/workspaces/{workspace-id}/`
2. **Access Key**: IAM access key with read permissions;
3. **Secret Key**: Corresponding secret key;
4. **Region**: The region where your workspace resides;
5. **Test Connection**: Verify the configuration;
6. **Save**: Save the integration.

## Data Import Settings

These advanced settings allow you to tune the data extraction process for Prometheus-compatible data sources to optimize performance and manage query load. In most cases, the default values work well, but you can adjust them based on your environment size and query performance.

* **Retries**: Number of retry attempts for failed Prometheus queries (default: 1). Increase this value if you experience intermittent connectivity issues or timeout errors. Higher values increase resilience but may slow down extraction when encountering persistent failures;
* **Parallelism**: Number of concurrent worker threads for parallel query execution (default: 5). Increase for faster extraction if your Prometheus instance can handle higher query rates, or decrease to reduce load on the server;
* **Batch Size**: Number of entities (pods, containers, and so on) included in a single Prometheus query (default: 20). Increase to reduce the total number of queries (faster extraction), or decrease to reduce individual query complexity and memory usage.
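
To get a feel for how **Batch Size** and **Parallelism** interact, the sketch below estimates the query count for one metric. It uses a deliberately simplified model that is an assumption, not the documented extraction algorithm: one query per batch of entities, executed in rounds of `parallelism` concurrent queries.

```python
import math


def estimate_query_load(entities: int, batch_size: int = 20, parallelism: int = 5) -> tuple[int, int]:
    """Estimate (total queries, sequential rounds) for one metric.

    Simplified model: one query per batch, `parallelism` queries per round.
    """
    queries = math.ceil(entities / batch_size)
    rounds = math.ceil(queries / parallelism)
    return queries, rounds


# 1,000 pods with the default settings.
print(estimate_query_load(1000))
# Same cluster with a larger batch size: fewer, heavier queries.
print(estimate_query_load(1000, batch_size=50))
```

Under this model, raising the batch size from 20 to 50 cuts the query count for 1,000 entities from 50 to 20, at the cost of each query touching more series.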

