This page describes how to set up a CloudWatch exporter in order to gather AWS metrics through the Prometheus provider. This is especially useful to monitor system metrics when you don’t have direct SSH access to AWS resources like EC2 Instances or if you want to gather AWS-specific metrics not available in the guest OS.
In order to fetch metrics from CloudWatch, the exporter requires an IAM user or role with the following privileges:
cloudwatch:GetMetricData
cloudwatch:GetMetricStatistics
cloudwatch:ListMetrics
tag:GetResources
You can assign AWS-managed policies CloudWatchReadOnlyAccess and ResourceGroupsandTagEditorReadOnlyAccess to the desired user to enable these permissions.
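If you prefer a custom policy over the AWS-managed ones, the four privileges above can be collected into a single policy document. The following is a minimal sketch that only builds and prints the JSON; the function name `build_exporter_policy` is hypothetical, and granting the actions on all resources (`"Resource": "*"`) is an assumption you may want to tighten.

```python
import json

# The four actions listed above; nothing else is granted.
REQUIRED_ACTIONS = [
    "cloudwatch:GetMetricData",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:ListMetrics",
    "tag:GetResources",
]


def build_exporter_policy(actions=REQUIRED_ACTIONS):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: the read-only calls are allowed on all resources.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be pasted into a custom IAM policy.
    print(json.dumps(build_exporter_policy(), indent=2))
```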
The CloudWatch exporter repository is available on the official project page. It requires a minimal configuration to fetch metrics from the desired AWS instances. Below is a short list of the parameters needed for a minimal configuration:
region: AWS region of the monitored resource
metrics: a list of objects containing filters for the exported metrics
aws_namespace: the namespace of the monitored resource
aws_metric_name: the name of the AWS metric to fetch
aws_dimensions: the dimension to expose as labels
aws_dimension_select: the dimension to filter over
aws_statistics: the list of metric statistics to expose
aws_tag_select: optional tags to filter on
tag_selections: map containing the list of values to select for each tag
resource_type_selection: resource type to fetch the tags from (see: Resource Types)
resource_id_dimension: dimension to use for the resource id (see: Resource Types)
For a complete list of possible values for namespaces, metrics, and dimensions please refer to the official AWS CloudWatch User Guide.
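To make the nesting of the parameters above easier to follow, the sketch below models a minimal configuration as a Python dictionary and checks it for obviously missing fields. The values (region, namespace, metric, tag values) are illustrative assumptions, and treating `region`, `aws_namespace`, and `aws_metric_name` as the mandatory fields is also an assumption of this sketch, not a statement about the exporter's own validation.

```python
# Hypothetical minimal configuration, mirroring the parameters listed above.
EXAMPLE_CONFIG = {
    "region": "eu-west-1",  # assumption: replace with your own region
    "metrics": [
        {
            "aws_namespace": "AWS/EC2",
            "aws_metric_name": "CPUUtilization",
            "aws_dimensions": ["InstanceId"],
            "aws_statistics": ["Average"],
            "aws_tag_select": {
                "tag_selections": {"Name": ["server1", "server2"]},
                "resource_type_selection": "ec2:instance",
                "resource_id_dimension": "InstanceId",
            },
        }
    ],
}

# Assumption: the fields this sketch treats as mandatory for each metric.
REQUIRED_METRIC_KEYS = ("aws_namespace", "aws_metric_name")


def find_config_problems(config):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not config.get("region"):
        problems.append("missing 'region'")
    metrics = config.get("metrics") or []
    if not metrics:
        problems.append("'metrics' must list at least one metric")
    for index, metric in enumerate(metrics):
        for key in REQUIRED_METRIC_KEYS:
            if not metric.get(key):
                problems.append(f"metrics[{index}]: missing '{key}'")
    return problems
```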
Notice: AWS bills CloudWatch usage in batches of 1 million requests, where every metric counts as a single request. To avoid unnecessary expenses configure only the metrics you need.
The suggested deployment mode for the exporter is through a Docker image. The following snippet provides a command line example to run the container (remember to provide your AWS credentials if needed and the path of the configuration file):
You can refer to the official guide for more details or alternative deployment modes.
In order to scrape the newly created exporter, add a new job to the configuration file. You will also need to define some relabeling rules in order to add the `instance` label required by Akamas to properly filter the incoming metrics.
In the example below the `instance` label is copied from the instance’s `Name` tag:
Notice: AWS bills CloudWatch usage in batches of 1 million requests, where every metric counts as a single request. To avoid unnecessary expenses configure an appropriate scraping interval.
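To get a feeling for how the scraping interval drives the bill, you can estimate the number of requests before choosing it. The sketch below assumes that every scrape fetches every configured metric once and that each fetch counts as one request, as stated in the notice above; the function names are hypothetical and no price is assumed.

```python
import math


def estimate_requests(metric_count, scrape_interval_s, period_s):
    """Requests issued over a period, assuming one request per metric per scrape."""
    scrapes = math.ceil(period_s / scrape_interval_s)
    return metric_count * scrapes


def billed_batches(requests, batch_size=1_000_000):
    """Number of 1-million-request batches needed to cover the requests."""
    return math.ceil(requests / batch_size)


if __name__ == "__main__":
    thirty_days = 30 * 24 * 3600
    for interval in (60, 15):
        requests = estimate_requests(10, interval, thirty_days)
        print(interval, requests, billed_batches(requests))
```

With 10 metrics over 30 days, a 60-second interval produces 432,000 requests (one batch), while a 15-second interval produces 1,728,000 requests (two batches).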
Once you have configured the exporter in the Prometheus configuration, you can start to fetch metrics using the Prometheus provider. The following sections describe some scripts you can add as tasks in your workflow.
It’s worth noting that CloudWatch may require some minutes to aggregate the stats according to the configured granularity, causing the telemetry provider to fail while trying to fetch data points not available yet. To avoid such issues you can add at the end of your workflow a task using an Executor operator to wait for the CloudWatch metrics to be ready. The following script is an example of implementation:
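The waiting logic can be as simple as sleeping until the last aggregation window has closed. The following is a sketch in Python rather than the original script: the function names, the 300-second granularity, and the 60-second safety margin are all assumptions to adapt to your exporter configuration.

```python
import time


def seconds_to_wait(test_end_epoch, granularity_s, margin_s, now_epoch):
    """Seconds left before the last data point is assumed to be available."""
    ready_at = test_end_epoch + granularity_s + margin_s
    return max(0, ready_at - now_epoch)


def wait_for_cloudwatch(test_end_epoch, granularity_s=300, margin_s=60):
    """Block until CloudWatch has (presumably) aggregated the last window."""
    remaining = seconds_to_wait(test_end_epoch, granularity_s, margin_s, time.time())
    time.sleep(remaining)
    return remaining


if __name__ == "__main__":
    # Assumption: the load test ended when this script was started.
    waited = wait_for_cloudwatch(time.time())
    print(f"waited {waited:.0f} seconds for CloudWatch metrics")
```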
Since Amazon bills your CloudWatch queries, it is wise to run the exporter only when needed. The following script allows you to manage the exporter from the workflow by adding the following tasks:
start the container right before the beginning of the load test (command: `bash script.sh start`)
stop the container after the metrics publication, as described in the previous section (command: `bash script.sh stop`).
The example below is the Akamas-supported configuration, fetching metrics of EC2 instances named server1 and server2.
The Prometheus provider collects metrics from a Prometheus instance and makes them available to Akamas.
This provider includes support for several technologies (Prometheus exporters). In any case, custom queries can be defined to gather the desired metrics.
This section provides the minimum requirements that you should meet before using the Prometheus provider.
Akamas supports Prometheus starting from version 2.26.
If you also use the `prometheus-operator`, version 0.47 or greater of the operator is required. This version is bundled with the `kube-prometheus-stack` since version 15.
Connectivity between the Akamas server and the Prometheus server is also required. By default, Prometheus is run on port 9090.
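Before creating a telemetry instance, it can be useful to verify from the Akamas server that the Prometheus port is reachable. The following is a minimal sketch of a plain TCP check; the function name `can_connect` and the host name used in the example are hypothetical, and a successful connection only proves that the port is open, not that Prometheus is healthy.

```python
import socket


def can_connect(host, port=9090, timeout_s=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Assumption: replace the host name with the address of your Prometheus server.
    print(can_connect("prometheus.example.local", 9090))
```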
Node exporter (Linux system metrics)
JMX exporter (Java metrics)
cAdvisor (Docker container metrics)
CloudWatch exporter (AWS resources metrics)
JMeter (Web application metrics)
The Prometheus provider includes queries for most of the monitoring use cases these exporters cover. If you need to specify custom queries or make use of exporters not currently supported you can specify them as described in creating Prometheus telemetry instances.
Kubernetes (Pod, Container, Workload, Namespace)
Web Application
Java (java-ibm-j9vm-6, java-ibm-j9vm-8, java-eclipse-openj9-11, java-openjdk-8, java-openjdk-11)
Linux (Ubuntu-16.04, Rhel-7.6)
Refer to Prometheus provider metrics mapping to see how component-type metrics are extracted by this provider.
Akamas reasons in terms of a system to be optimized and in terms of parameters and metrics of components of that system. To understand which metrics collected from Prometheus should be mapped to a component, the Prometheus provider looks up some properties in the components of a system, grouped under the `prometheus` property. These properties depend on the exporter and the component type.
Nested under this property you can also include any additional field your use case may require to filter the imported metrics further. These fields will be appended in queries to the list of label matches in the form `field_name=~'field_value'`, and can specify either exact values or patterns.
Notice: you should configure your Prometheus instances so that the Prometheus provider can leverage the `instance` property of components, as described in the Setup datasource section.
It is important that you add the `instance` and, optionally, the `job` properties to the components of a system so that the Prometheus provider can gather metrics from them:
The Prometheus provider does not usually require a specific configuration of the Prometheus instance it uses.
When gathering metrics for hosts it's usually convenient to set the value of the `instance` label so that it matches the value of the `instance` property in a component; in this way, the Prometheus provider knows which system component each data point refers to.
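The relationship between labels and components can be illustrated with a small lookup. This is only a sketch of the idea, not how the provider is implemented: the function `component_for` and the dictionary layout of the components are assumptions, and a missing `job` property is assumed to match any job.

```python
import re


def component_for(series_labels, components):
    """Return the name of the first component whose `instance` (and optional
    `job`) patterns match the labels of a series, or None if none matches."""
    for component in components:
        props = component["properties"]["prometheus"]
        # Prometheus regex matchers (=~) are fully anchored, hence fullmatch.
        instance_ok = re.fullmatch(props["instance"], series_labels.get("instance", ""))
        job_ok = re.fullmatch(props.get("job", ".*"), series_labels.get("job", ""))
        if instance_ok and job_ok:
            return component["name"]
    return None


if __name__ == "__main__":
    components = [
        {"name": "frontend", "properties": {"prometheus": {"instance": "frontend", "job": "node"}}},
    ]
    print(component_for({"instance": "frontend", "job": "node"}, components))
```

If the `instance` label of a series does not match any component, the data point cannot be attributed, which is why aligning the label with the component property matters.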
Here’s an example configuration for Prometheus that sets the `instance` label:
To install the Prometheus provider, create a YAML file (`provider.yml` in this example) with the definition of the provider:
Then you can install the provider using the Akamas CLI:
The installed provider is shared with all users of your Akamas installation and can monitor many different systems, by configuring appropriate telemetry provider instances.
This page describes how to set up an OracleDB exporter in order to gather metrics regarding an Oracle Database instance through the Prometheus provider.
The OracleDB exporter repository is available on the official project page. The suggested deploy mode is through a Docker image, since the Prometheus instance can easily access the running container through the Akamas network.
Use the following command line to run the container, where `cust-metrics.toml` is your configuration file defining the queries for additional custom metrics (see paragraph below) and `DATA_SOURCE_NAME` is an environment variable containing the Oracle EasyConnect string:
You can refer to the official guide for more details or alternative deployment modes.
It is possible to define additional queries to expose custom metrics using any data in the database instance that is readable by the monitoring user (see the guide for more details about the syntax).
The following is an example of exporting system metrics from the Dynamic Performance (V$) Views used by the Prometheus provider default queries for the Oracle Database optimization pack:
To create an instance of the Prometheus provider, edit a YAML file (`instance.yml` in this example) with the definition of the instance:
Then you can create the instance for the `system` using the Akamas CLI:
When you create an instance of the Prometheus provider, you should specify some configuration information to allow the provider to extract and process metrics from Prometheus correctly.
You can specify configuration information within the `config` part of the YAML of the instance definition.
`address`, a URL or IP identifying the address of the host where Prometheus is installed
`port`, the port exposed by Prometheus
`user`, the username for the Prometheus service
`password`, the user password for the Prometheus service
`job`, a string to specify the scraping job name. The default is ".*" for all scraping jobs
`logLevel`, set this to "DETAILED" for some extra logs when searching for metrics (default value is "INFO")
`headers`, to specify additional custom headers, e.g. `headers: {"custom_key": "custom_value"}`
`namespace`, a string to specify the namespace
`duration`, integer to determine the duration in seconds for data collection (use a number between 1 and 3600)
`enableHttps`, boolean to enable HTTPS in Prometheus (since 3.2.6)
`ignoreCertificates`, boolean to ignore SSL certificates
`disableConnectionCheck`, boolean to disable initial connection check to Prometheus
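The constraints stated in the list above (the default values of `job` and `logLevel`, the allowed range of `duration`) can be checked before submitting the instance definition. The sketch below models a subset of the fields; the class `PrometheusInstanceConfig`, its `base_url` helper, the `False` default for HTTPS, and the assumption that `address` is a bare host name or IP are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PrometheusInstanceConfig:
    address: str
    port: int
    job: str = ".*"             # default stated in the list above
    log_level: str = "INFO"     # default stated in the list above
    duration: Optional[int] = None
    enable_https: bool = False  # assumption: HTTPS is off unless enabled
    headers: Dict[str, str] = field(default_factory=dict)

    def validate(self):
        """Raise ValueError if `duration` is outside the documented range."""
        if self.duration is not None and not 1 <= self.duration <= 3600:
            raise ValueError("duration must be between 1 and 3600 seconds")
        return self

    def base_url(self):
        """Build the URL, assuming `address` is a bare host name or IP."""
        scheme = "https" if self.enable_https else "http"
        return f"{scheme}://{self.address}:{self.port}"


if __name__ == "__main__":
    config = PrometheusInstanceConfig("prometheus.example.local", 9090, duration=30)
    print(config.validate().base_url())
```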
The Prometheus provider allows defining additional queries to populate custom metrics or redefine the default ones according to your use case. You can configure additional metrics using the `metrics` field as shown in the configuration below:
In this example, the telemetry instance will populate `cust_metric` with the results of the query specified in `datasource`, maintaining the value of the labels listed under `labels`.
Please refer to Querying basics | Prometheus for a complete reference of PromQL.
Akamas pre-processes the queries before running them, replacing special-purpose placeholders with the fields provided in the components. For example, given the following component definition:
the query `sum(jvm_memory_used_bytes{instance=~"$INSTANCE$", job=~"$JOB$"})` will be expanded for this component into `sum(jvm_memory_used_bytes{instance=~"service01", job=~"jmx"})`. This provides greater flexibility through the templatization of the queries, allowing the same query to select the correct data sources for different components.
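The expansion described above amounts to a plain text substitution. The following sketch reproduces it for the `$INSTANCE$` and `$JOB$` placeholders only; the function `expand_query` is hypothetical and not part of Akamas, and falling back to `.*` when the component defines no `job` is an assumption.

```python
def expand_query(query, prometheus_props):
    """Replace $INSTANCE$ and $JOB$ with the values defined in the component."""
    placeholders = {
        "$INSTANCE$": prometheus_props["instance"],
        # Assumption: a component without a job matches every scraping job.
        "$JOB$": prometheus_props.get("job", ".*"),
    }
    for placeholder, value in placeholders.items():
        query = query.replace(placeholder, value)
    return query


if __name__ == "__main__":
    template = 'sum(jvm_memory_used_bytes{instance=~"$INSTANCE$", job=~"$JOB$"})'
    print(expand_query(template, {"instance": "service01", "job": "jmx"}))
```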
The following is the list of available placeholders:
This section reports common use cases addressed by this provider.
To gather Kubernetes metrics, the following exporters are required:
kube-state-metrics
cAdvisor
As an example, you can define a component with type `Kubernetes Container` in this way:
Check the Java OpenJDK page for a list of all the Java metrics available in Akamas.
You can leverage the Prometheus provider to collect Java metrics by using the JMX Exporter. The JMX Exporter is a collector of Java metrics for Prometheus that can be run as an agent for any Java application. Once downloaded, you execute it alongside a Java application with this command:
The command exposes the Java metrics of `yourJar.jar` on localhost, port 9100, where Prometheus can scrape them.
`config.yaml` is the configuration file of the exporter. It is suggested to use this configuration for an optimal experience with the Prometheus provider:
As a next step, add a new scraping target in the configuration of the Prometheus used by the provider:
You can then create a YAML file with the definition of a telemetry instance (`prom_instance.yml`) of the Prometheus provider:
And you can create the telemetry instance using the Akamas CLI:
Finally, to bind the extracted metrics to the related component, you should add the following field to the `properties` of the component’s definition:
Check the Linux page for a list of all the system metrics available in Akamas.
You can leverage the Prometheus provider to collect system metrics (Linux) by using the Node exporter. The Node exporter is a collector of system metrics for Prometheus that can be run as a standalone executable or a service within a Linux machine to be monitored. Once downloaded, schedule it as a service using, for example, systemd:
Here’s the manifest of the `node_exporter` service:
The service exposes system metrics on localhost, port 9100, where Prometheus can scrape them.
As a final step, add a new scraping target in the configuration of the Prometheus used by the provider:
You can then create a YAML file with the definition of a telemetry instance (`prom_instance.yml`) of the Prometheus provider:
And you can create the telemetry instance using the Akamas CLI:
Finally, to bind the extracted metrics to the related component, you should add the following field to the `properties` of the component’s definition:
Placeholder | Usage example | Component definition example | Expanded query | Description |
---|---|---|---|---|
`$INSTANCE$`, `$JOB$` | `node_load1{instance=~"$INSTANCE$", job=~"$JOB$"}` | See Example below | `node_load1{instance=~"frontend", job=~"node"}` | These placeholders are replaced respectively with the `instance` and `job` fields configured in the component’s `prometheus` configuration. |
`%FILTERS%` | `container_memory_usage_bytes{job=~"$JOB$" %FILTERS%}` | See Example below | `container_memory_usage_bytes{job=~"advisor", name=~"db-.*"}` | This placeholder is replaced with a list containing any additional filter in the component’s definition (other than `instance` and `job`), where each field is expanded as `field_name=~"field_value"`. This is useful to define additional label matches in the query without the need to hardcode them. |
`$DURATION$` | `rate(http_client_requests_seconds_count[$DURATION$])` | | `rate(http_client_requests_seconds_count[30s])` | If not set in the component properties, this placeholder is replaced with the `duration` field configured in the telemetry instance. You should use it with range vectors instead of hardcoding a fixed value. |
`$NAMESPACE$`, `$POD$`, `$CONTAINER$` | `1e3 * avg(kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})` | | `1e3 * avg(kube_pod_container_resource_limits{resource="cpu", namespace=~"boutique", pod=~"adservice.*", container=~"server"})` | These placeholders are used within Kubernetes environments. |
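The `%FILTERS%` and `$DURATION$` placeholders can be illustrated with a short sketch that reproduces the expanded queries listed above. The functions `expand_filters` and `expand_duration` are hypothetical, and the sketch assumes that `%FILTERS%` always follows at least one other label matcher and that only `instance` and `job` are excluded from the filter list.

```python
import re

# Fields that have their own placeholders and are not treated as extra filters.
RESERVED = {"instance", "job"}


def expand_filters(query, prometheus_props):
    """Replace %FILTERS% with one `name=~"value"` matcher per extra field."""
    extra = "".join(
        f', {name}=~"{value}"'
        for name, value in prometheus_props.items()
        if name not in RESERVED
    )
    # Swallow the whitespace before the placeholder so that an empty filter
    # list leaves no trailing space inside the braces. A function is used as
    # the replacement so that backslashes in patterns are kept verbatim.
    return re.sub(r"\s*%FILTERS%", lambda _match: extra, query)


def expand_duration(query, duration_s):
    """Replace $DURATION$ with the duration expressed in seconds."""
    return query.replace("$DURATION$", f"{duration_s}s")


if __name__ == "__main__":
    template = 'container_memory_usage_bytes{job=~"advisor" %FILTERS%}'
    print(expand_filters(template, {"job": "advisor", "name": "db-.*"}))
    print(expand_duration("rate(http_client_requests_seconds_count[$DURATION$])", 30))
```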