CloudWatch Exporter

This page describes how to set up a CloudWatch exporter in order to gather AWS metrics through the Prometheus provider. This is especially useful for monitoring system metrics when you don’t have direct SSH access to AWS resources such as EC2 instances, or when you want to gather AWS-specific metrics not available in the guest OS.

AWS policies

In order to fetch metrics from CloudWatch, the exporter requires an IAM user or role with the following privileges:

  • cloudwatch:GetMetricData

  • cloudwatch:GetMetricStatistics

  • cloudwatch:ListMetrics

  • tag:GetResources

You can assign the AWS-managed policies CloudWatchReadOnlyAccess and ResourceGroupsandTagEditorReadOnlyAccess to the desired user or role to enable these permissions.
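
If you prefer not to attach the broader managed policies, the four privileges listed above can also be granted through a dedicated policy. The following is a minimal sketch and not part of the original setup: the file location and the policy name cloudwatch-exporter-read are hypothetical, and the AWS CLI call is left as a comment because it needs credentials that are allowed to manage IAM.

```shell
#!/bin/bash
# Write an IAM policy document granting only the four actions listed above.
# The file location is a hypothetical choice; adjust it to your environment.
POLICY_FILE="${TMPDIR:-/tmp}/cloudwatch-exporter-policy.json"

cat > "$POLICY_FILE" <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "cloudwatch:GetMetricData",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "tag:GetResources"
      ],
      "Resource": "*"
    }
  ]
}
EOF

echo "Policy written to ${POLICY_FILE}"

# To create the policy with the AWS CLI (policy name is hypothetical):
#   aws iam create-policy --policy-name cloudwatch-exporter-read \
#     --policy-document "file://${POLICY_FILE}"
```

The resulting policy can then be attached to the user or role whose credentials are passed to the exporter.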

Exporter configuration

The CloudWatch exporter repository is available on the official project page. The exporter requires only a minimal configuration to fetch metrics from the desired AWS instances. The parameters needed for a minimal configuration are:

  • region: AWS region of the monitored resource

  • metrics: a list of objects containing filters for the exported metrics

    • aws_namespace: the namespace of the monitored resource

For a complete list of possible values for namespaces, metrics, and dimensions, please refer to the official AWS CloudWatch User Guide.
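
Before starting the exporter it can help to check that the configuration file contains the required keys. The sketch below is an illustration only: it writes a minimal configuration with placeholder values to a temporary file and runs a plain text check on it. The check_config helper is hypothetical and is not a YAML validator; it only looks for the key names.

```shell
#!/bin/bash
# Write a minimal exporter configuration (all values are placeholders).
CONFIG_FILE=$(mktemp)
cat > "$CONFIG_FILE" <<'EOF'
region: us-east-2
metrics:
  - aws_namespace: AWS/EC2
    aws_metric_name: CPUUtilization
    aws_statistics: [Average]
    aws_dimensions: [InstanceId]
EOF

# Hypothetical helper: succeeds only if every required key appears in the file.
check_config() {
  local file=$1 key
  for key in region metrics aws_namespace aws_metric_name; do
    # A key may be preceded by indentation and by a list dash
    grep -Eq "^[[:space:]-]*${key}:" "$file" || {
      echo "missing key: ${key}" >&2
      return 1
    }
  done
  echo "configuration looks complete"
}

check_config "$CONFIG_FILE"
```

A failing check returns a non-zero status, so it can be used to stop a workflow task before the container is started with an incomplete file.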

Notice: AWS bills CloudWatch usage in batches of 1 million requests, where every metric counts as a single request. To avoid unnecessary expenses, configure only the metrics you need.

The suggested deployment mode for the exporter is through a Docker image. The following snippet provides a command line example to run the container (remember to provide your AWS credentials if needed and the path of the configuration file):

You can refer to the official guide for more details or alternative deployment modes.

Prometheus configuration

In order to scrape the newly created exporter, add a new job to the Prometheus configuration file. You will also need to define some relabeling rules in order to add the instance label required by Akamas to properly filter the incoming metrics. In the example below, the instance label is copied from the instance’s Name tag:

Notice: AWS bills CloudWatch usage in batches of 1 million requests, where every metric counts as a single request. To avoid unnecessary expenses, configure an appropriate scraping interval.
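
To get a feeling for how the number of metrics and the scraping interval add up, the request volume can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: it takes the rule above (one request per metric) and additionally assumes each monitored instance is counted separately on every scrape. The estimate_requests function is hypothetical; the figures used (13 metrics, 2 instances, 60 seconds) match the Akamas-supported configuration and the scrape job shown in this page.

```shell
#!/bin/bash
# Rough estimate of the CloudWatch requests generated by the exporter.
# Assumption: one request per metric, per instance, per scrape.
estimate_requests() {
  local metrics=$1 instances=$2 interval_sec=$3 days=$4
  local scrapes_per_day=$(( 86400 / interval_sec ))
  echo $(( metrics * instances * scrapes_per_day * days ))
}

estimate_requests 13 2 60 1     # one day     -> 37440
estimate_requests 13 2 60 30    # thirty days -> 1123200
```

Under these assumptions a 60 second interval slightly exceeds one million requests in thirty days, while doubling the interval or running the exporter only during tests keeps the total well below that.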

Additional workflow task

Once you have configured the exporter in the Prometheus configuration, you can start fetching metrics using the Prometheus provider. The following sections describe some scripts you can add as tasks in your workflow.

Wait for metrics

It’s worth noting that CloudWatch may require a few minutes to aggregate the stats according to the configured granularity, causing the telemetry provider to fail while trying to fetch data points that are not available yet. To avoid such issues, you can add a task at the end of your workflow that uses an Executor operator to wait for the CloudWatch metrics to be ready. The following script is an example implementation:

Start/stop the exporter as needed

Since Amazon bills your CloudWatch queries, it is wise to run the exporter only when needed. The following script allows you to manage the exporter from the workflow by adding the following tasks:

  • start the container right before the beginning of the load test (command: bash script.sh start)

  • stop the container after the metrics publication, as described in the previous section (command: bash script.sh stop).

Custom Configuration file

The example below is the Akamas-supported configuration, fetching metrics of EC2 instances named server1 and server2.

Each entry in the metrics list accepts the following parameters in addition to aws_namespace:

  • aws_metric_name: the name of the AWS metric to fetch

  • aws_dimensions: the dimensions to expose as labels

  • aws_dimension_select: the dimensions to filter over

  • aws_statistics: the list of metric statistics to expose

  • aws_tag_select: optional tags to filter on

    • tag_selections: map containing the list of values to select for each tag

    • resource_type_selection: resource type to fetch the tags from (see: Resource Types)

    • resource_id_dimension: dimension to use for the resource id (see: Resource Types)

    region: us-east-2
    metrics:
      - aws_namespace: AWS/EC2
        aws_metric_name: CPUUtilization
        aws_statistics: [Average]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: NetworkIn
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: NetworkOut
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: NetworkPacketsIn
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: NetworkPacketsOut
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: CPUCreditUsage
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: CPUCreditBalance
        aws_statistics: [Average]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: EBSReadOps
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: EBSWriteOps
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: EBSReadBytes
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: EBSWriteBytes
        aws_statistics: [Sum]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: EBSIOBalance%
        aws_statistics: [Average]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId
    
      - aws_namespace: AWS/EC2
        aws_metric_name: EBSByteBalance%
        aws_statistics: [Average]
        aws_dimensions: [InstanceId]
        # aws_dimension_select:
        #   InstanceId: [i-XXXXXXXXXXXXXXXXX]
        aws_tag_select:
          tag_selections:
            Name: [server1, server2]
          resource_type_selection: ec2:instance
          resource_id_dimension: InstanceId

Docker command to run the exporter container:

    docker run -d --name cloudwatch_exporter \
      -p 9106:9106 \
      -v "$(pwd)/cloudwatch-exporter.yaml:/config/config.yml" \
      -e AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID}" -e AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY}" \
      prom/cloudwatch-exporter

Prometheus scrape job for the exporter:

    scrape_configs:
      - job_name: cloudwatch_exporter
        scrape_interval: 60s
        scrape_timeout: 30s
        static_configs:
          - targets: [cloudwatch_exporter:9106]
        metric_relabel_configs:
          - source_labels: [tag_Name]
            regex: '(.+)'
            target_label: instance
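
Script that waits for the CloudWatch metrics to be available in Prometheus:

    #!/bin/bash

    METRIC=aws_rds_cpuutilization_sum   # metric to check for; use one that your exporter configuration exports
    DELAY_SEC=15                        # pause between attempts
    RETRIES=60                          # maximum number of attempts

    # Timestamp taken once, in UTC to match the trailing Z (%3N requires GNU date)
    NOW=$(date -u +'%FT%T.%3NZ')

    for i in $(seq "$RETRIES"); do
      sleep "$DELAY_SEC"
      # jq -e exits non-zero while the result list is empty, so the loop retries until a data point exists at $NOW
      curl -sS "http://prometheus_host/api/v1/query?query=${METRIC}&time=${NOW}" | jq -ce '.data.result[]' && exit 0
    done

    exit 255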

Script (script.sh) that starts and stops the exporter container:

    #!/bin/bash

    set -e

    CMD=$1
    CONT_NAME=cloudwatch_exporter

    # Remove the container if it exists, whether running or stopped
    stop_cont() {
      [ -z "$(docker ps -aq -f "name=${CONT_NAME}")" ] || (echo "Removing ${CONT_NAME}" && docker rm -f "${CONT_NAME}")
    }

    case $CMD in
      stop|remove)
        stop_cont
        ;;

      start)
        stop_cont

        # Read the keys from the AWS credentials file (assumes the file defines a single profile)
        AWS_ACCESS_KEY_ID=$(awk 'BEGIN { FS = "=" } /aws_access_key_id/ { print $2 }' ~/.aws/credentials | tr -d '[:space:]')
        AWS_SECRET_ACCESS_KEY=$(awk 'BEGIN { FS = "=" } /aws_secret_access_key/ { print $2 }' ~/.aws/credentials | tr -d '[:space:]')

        echo "Starting container $CONT_NAME"
        docker run -d --name "$CONT_NAME" \
          -p 9106:9106 \
          -v ~/oracle-database/utils/cloudwatch-exporter.yaml:/config/config.yml \
          -e AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID}" -e AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY}" \
          prom/cloudwatch-exporter
        ;;

      *)
        echo "Unrecognized option $CMD"
        exit 255
        ;;
    esac