
Manage the Akamas Server

This section covers several topics related to managing the Akamas Server:

  • Akamas logs

  • Audit logs

  • Install upgrades and patches

  • Backup & Recovery of the Akamas Server

  • Monitor the Akamas Server

Akamas logs

Akamas allows dumping log entries filtered by service, workspace, workflow, study, experiment, and trial, for a specific timeframe and at different log levels.

Akamas CLI for logs

Akamas logs can be dumped via the following CLI command:

akamas log

This command provides many filters, which can be listed with the following command:

akamas log --help

which should return:

Usage: akamas log [OPTIONS] [MESSAGE]

  Show Akamas logs

Options:
  -d, --debug                     Show extended error messages if present.
  --page-size INTEGER             Number of log's lines to be retrieved NOTE:
                                  This argument is mutually exclusive with
                                  arguments: [dump, no_pagination].
  --no-pagination                 Disable pagination and print all logs NOTE:
                                  This argument is mutually exclusive with
                                  arguments: [dump, page_size].
  --dump                          Print the logs without pagination and
                                  formatting NOTE: This argument is mutually
                                  exclusive with arguments: [page_size,
                                  no_pagination].
  -f, --from [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S|%Y-%m-%dT%H:%M:%S.%f|%Y-%m-%d %H:%M:%S.%f|[-]nw|[-]nd|[-]nh|[-]nm|[-]ns]
                                  The start timestamp of the logs
  -t, --to [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S|%Y-%m-%dT%H:%M:%S.%f|%Y-%m-%d %H:%M:%S.%f|[-]nw|[-]nd|[-]nh|[-]nm|[-]ns]
                                  The end timestamp of the logs
  -s, --study TEXT                UUID or name of the Study
  -e, --exp INTEGER               Number of the experiment
  --trial INTEGER                 Number of the trial
  -y, --system TEXT               UUID or name of the System
  -W, --workflow TEXT             UUID or name of the Workflow
  -l, --log-level TEXT            Log level
  -S, --service TEXT              Akamas service
  --without-metadata              Hide metadata
  --sorting [ASC|DESC]            Sorting order of the timestamps
  -ws, --workspace TEXT           UUID or name of the Workspace to visualize.
                                  When empty, system logs will be returned
                                  instead
  --help                          Show this message and exit.

For example, to get the list of the most recent Akamas errors:

akamas log -l ERROR

which should return something similar to:

       timestamp                         system                  provider    service                                                                                   message
==============================================================================================================================================================================================================================================================
2022-05-02T15:51:26.88    -                                      -          airflow     Task failed with exception
2022-05-02T15:51:26.899   -                                      -          airflow     Failed to execute job 2 for task Akamas_LogCurator_Task
2022-05-02T15:56:29.195   -                                      -          airflow     Task failed with exception
2022-05-02T15:56:29.215   -                                      -          airflow     Failed to execute job 3 for task Akamas_LogCurator_Task
2022-05-02T16:01:55.587   -                                      -          license     2022-05-02 16:01:47.426 ERROR 1 --- [           main] c.a.m.utils.rest.RestHandlers            :  has failed with returning a response:
                                                                                        {"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following metrics: 'spark.spark_application_duration' were not found
                                                                                        in any of the components of the system 'analytics_cluster'","path":null}
2022-05-02T16:01:55.587   -                                      -          license     2022-05-02 16:01:47.434 ERROR 1 --- [           main] c.a.m.MigrationApplication               : Unable to complete operation. Mode: RESTORE. Cause: A request to a
                                                                                        downstream service CampaignService has failed: 400 : [{"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following
                                                                                        metrics: 'spark.spark_application_duration' were not found in any of the components of the system 'analytics_cluster'","path":null}]
2022-05-02T16:01:55.678   -                                      -          license     2022-05-02 16:01:47.434 ERROR 1 --- [           main] c.a.m.MigrationApplication               : Unable to complete operation. Mode: RESTORE. Cause: A request to a
                                                                                        downstream service CampaignService has failed: 400 : [{"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following
                                                                                        metrics: 'spark.spark_application_duration' were not found in any of the components of the system 'analytics_cluster'","path":null}]
2022-05-02T16:01:55.678   -                                      -          license     2022-05-02 16:01:47.426 ERROR 1 --- [           main] c.a.m.utils.rest.RestHandlers            :  has failed with returning a response:
                                                                                        {"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following metrics: 'spark.spark_application_duration' were not found
                                                                                        in any of the components of the system 'analytics_cluster'","path":null}
2022-05-02T16:12:10.261   -                                      -          license     2022-05-02 16:05:53.209 ERROR 1 --- [           main] c.a.m.services.CampaignService           : de9f5ff9-418e-4e25-ae2c-12fc8e72cafc
2022-05-02T16:32:07.216   -                                      -          license     2022-05-02 16:31:37.330 ERROR 1 --- [           main] c.a.m.services.CampaignService           : 06c4b858-8353-429c-bacd-0cc56cc44634
2022-05-02T16:38:18.522   -                                      -          campaign    Internal Server Error: Object of class [com.akamas.campaign_service.entities.campaign.experiment.Experiment] with identifier
                                                                                        [ExperimentIdentifier(workspace=ac8481d3-d031-4b6a-8ae9-c7b366f027e8, study=de9f5ff9-418e-4e25-ae2c-12fc8e72cafc, id=2)]: optimistic locking failed; nested exception
                                                                                        is org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) :
                                                                                        [com.akamas.campaign_service.entities.campaign.experiment.Experiment#ExperimentIdentifier(workspace=ac8481d3-d031-4b6a-8ae9-c7b366f027e8,
                                                                                        study=de9f5ff9-418e-4e25-ae2c-12fc8e72cafc, id=2)]
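
The filters listed in the help above can be combined. As a hedged example (the study name and time window are placeholders), the following command dumps the last day of WARN-level entries for a given study, newest first:

akamas log -s my-study -l WARN -f -1d --sorting DESC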

Monitor the Akamas Server

External tools

You can use any monitoring tool to check the availability of the Akamas instance.
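
For instance, a simple HTTP probe can be wired into any monitoring tool. The sketch below is only an example: it assumes the Akamas API gateway is reachable at http://localhost:8000 (the endpoint shown in the akamas status output below); the exact URL to probe may differ in your setup.

#!/usr/bin/env bash
# Hedged availability probe: exit 0 if the Akamas endpoint answers, 1 otherwise.
if curl -s -o /dev/null --max-time 10 http://localhost:8000; then
  echo "Akamas endpoint reachable"
else
  echo "Akamas endpoint NOT reachable" >&2
  exit 1
fi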

Checking Akamas services

To check the status of the Akamas services, run akamas status -d to identify any service that is not able to start up correctly.

Here is an example of output:

Checking Akamas services on http://localhost:8000
 service	 status
=========================
analyzer       	UP
campaign       	UP
metrics        	UP
optimizer      	UP
orchestrator   	UP
system         	UP
telemetry      	UP
license        	UP
log            	UP
users          	UP
OK
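
The last line of the output is OK when all services are up. A minimal scripted check, assuming the output format shown above, could therefore be:

# Hedged example: alert if the last line of `akamas status -d` is not "OK"
if ! akamas status -d | tail -n 1 | grep -qx "OK"; then
  echo "One or more Akamas services are down" >&2
fi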

Backup & Recovery of the Akamas Server

Akamas server backup

The process of backing up an Akamas server can be divided into two parts: system backup and user data backup. Backups can be performed in any way you see fit: the artifacts are just regular files, so you can use any backup tool.

System backup

System services are hosted on an AWS ECR repository, so the only thing that fully defines a working Akamas application is the docker-compose.yml file. Performing a backup of the Akamas application is as simple as copying this single file to your backup location. You may schedule a script that performs this copy weekly, or at any other frequency you see fit.
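
A minimal sketch of such a script (the paths are examples; adjust them to your installation):

#!/usr/bin/env bash
# Hedged example: copy the compose file to a dated backup location.
set -euo pipefail
SRC="$HOME/akamas/docker-compose.yml"               # usual location (assumption)
DEST="/backup/akamas/docker-compose.yml.$(date +%F)"
mkdir -p "$(dirname "$DEST")"
cp "$SRC" "$DEST"

Scheduled via cron, for instance, with a crontab entry like 0 2 * * 0 /home/ubuntu/backup-akamas-compose.sh (every Sunday at 02:00).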

User data backup

You may list all existing Akamas studies via the Akamas CLI command:

akamas list study

Then you can export all existing studies one by one via the CLI command:

akamas export study <UUID>

where UUID is the UUID of a single study. This command exports the study into a single archive file (tar.gz). These archive files can be backed up to your favorite backup folder.
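
The two commands can be strung together in a script. The following is only a sketch: it assumes that akamas list study prints one study UUID per line (check the actual output format of your CLI version) and that the exported archives are written to the current working directory.

#!/usr/bin/env bash
# Hedged sketch: export every study into a dated backup folder.
set -euo pipefail
BACKUP_DIR="/backup/akamas/studies/$(date +%F)"   # example path
mkdir -p "$BACKUP_DIR"
cd "$BACKUP_DIR"
for uuid in $(akamas list study); do              # assumes one UUID per line
  akamas export study "$uuid"
done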

Akamas server recovery

Akamas server recovery involves restoring the system backup, restarting the Akamas services, and then re-importing the studies.

System restore

To restore the system, recover the original docker-compose.yml file and then launch the command:

docker-compose up &

from the folder where you placed this YAML file, then wait for the system to come up, checking it with the command:

akamas status -d

User data restore

All studies can be re-imported one by one with the CLI command (referring to the correct pathname of the archive):

akamas import study archive.tgz
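
As with the export, the import can be scripted over a folder of archives. A hedged sketch (the folder path is an example):

# Re-import all exported study archives from a backup folder.
for archive in /backup/akamas/studies/2024-01-01/*.tgz; do
  akamas import study "$archive"
done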

Install upgrades and patches

Akamas patches and upgrades need to be installed by following the specific instructions included in the package provided. In the case of new releases, it is recommended to read the related Release Notes. Under normal circumstances, this usually requires updating the Docker Compose configuration, as described in the next section.

Docker Compose configuration

When using Docker Compose to install Akamas, there's a folder, usually named akamas, in the user home folder that contains a docker-compose.yml file. This is a YAML text file containing the list of Docker services, with the image URLs/versions pointing to the ECR repository hosting all the Docker images needed to launch Akamas.

Here's an excerpt of such a docker-compose.yml file (this example contains 3 services only):

services:
  #####################
  # Database Service #
  #####################
  database:
    image: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/master-db:1.7.0
    container_name: database2
    restart: always
    command: postgres -c max_connections=200

  #####################
  # Optimizer Service #
  #####################
  optimizer:
    image: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/optimizer_service:2.3.0
    container_name: optimizer
    restart: always
    networks:
      - akamas2
    depends_on:
      - database
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/build/engine_input:/tmp/build/engine_input

  ####################
  # Campaign Service #
  ####################
  campaign:
    image: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/campaign_service:2.3.0
    container_name: campaign
    restart: always
    volumes:
      - config:/config
    networks:
      - akamas2
    depends_on:
      - database
      - optimizer
      - analyzer

The relevant lines that usually have to be patched during an upgrade are the lines with the key "image", like:

image: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/master-db:1.7.0

To update to a new version, replace the versions (1.7.0 or 2.3.0 in the excerpt above) after the colon with the new versions (ask Akamas support for the correct service versions for a specific Akamas release), then restart Akamas with the following console commands. First, log in to the Akamas CLI with:

akamas login

and type username and password as in the example below:

ubuntu@ak_machine:~/akamas/ $ akamas login
User: akamas
Password:
User akamas logged in. Welcome.

Now make sure you have the following AWS variables set with the proper values in your Linux user environment:

AWS_DEFAULT_REGION
AWS_SECRET_ACCESS_KEY
AWS_ACCESS_KEY_ID

Then log in to AWS with the following command:

aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin 485790562880.dkr.ecr.us-east-2.amazonaws.com

which should return:

Login Succeeded

Then pull all the new ECR images for the service versions you just changed (this should be done from inside the same folder where the docker-compose.yml file resides, usually $HOME/akamas/) with the following command:

docker-compose pull

It should return an output like the following:

Pulling database                ... done
Pulling optimizer               ... done
Pulling elasticsearch           ... done
Pulling log                     ... done
Pulling metrics                 ... done
Pulling telemetry               ... done
Pulling analyzer                ... done
Pulling campaign                ... done
Pulling system                  ... done
Pulling license                 ... done
Pulling store                   ... done
Pulling airflow-db              ... done
Pulling benchmark               ... done
Pulling kong-database           ... done
Pulling kong                    ... done
Pulling user-service            ... done
Pulling keycloak                ... done
Pulling logstash                ... done
Pulling kibana                  ... done
Pulling kong-consumer-init      ... done
Pulling kong-migration          ... done
Pulling keycloak-initializer    ... done
Pulling telemetry-init          ... done
Pulling curator-only-pull-image ... done
Pulling airflow                 ... done
Pulling orchestrator            ... done
Pulling akamas-init             ... done
Pulling akamas-ui               ... done
Pulling pg-admin                ... done
Pulling grafana                 ... done
Pulling prometheus              ... done
Pulling node-exporter           ... done
Pulling cadvisor                ... done
Pulling konga                   ... done

Finally, relaunch all services with:

docker-compose up -d

as in the usage example below:

ubuntu@ak_machine:~/akamas/ $ docker compose up -d
pgadmin4 is up-to-date
prometheus is up-to-date
benchmark is up-to-date
kibana is up-to-date
node-exporter is up-to-date
store is up-to-date
grafana is up-to-date
cadvisor is up-to-date
Starting telemetry-init ...
Starting curator-only-pull-image ...
Recreating database2             ...
Recreating airflow-db            ...
Starting kong-initializer        ...
akamas-ui is up-to-date
elasticsearch is up-to-date
Recreating kong-db               ...
Recreating metrics               ...
logstash is up-to-date
Recreating log                   ...
...(some logging follows)

Wait for a few minutes, then check that the Akamas services are back up by running the command:

akamas status -d

The expected output should be like the following (repeat the command after a minute or two if the last line is not "OK" as expected):

Checking Akamas services on http://localhost:8000
 service	 status
=========================
analyzer       	UP
campaign       	UP
metrics        	UP
optimizer      	UP
orchestrator   	UP
system         	UP
telemetry      	UP
license        	UP
log            	UP
users          	UP
OK
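
The whole procedure above can also be scripted. The following is only a hedged sketch: the service name and versions in the sed command are examples, and it assumes the AWS variables listed above are already exported in the environment.

#!/usr/bin/env bash
# Hedged sketch of the upgrade flow described in this section.
set -euo pipefail
cd "$HOME/akamas"

# 1. Back up the compose file, then bump an image tag (example values).
cp docker-compose.yml "docker-compose.yml.bak.$(date +%F)"
sed -i 's|akamas/optimizer_service:2.3.0|akamas/optimizer_service:2.4.0|' docker-compose.yml

# 2. Log in to the AWS ECR registry hosting the images.
aws ecr get-login-password --region us-east-2 \
  | docker login --username AWS --password-stdin 485790562880.dkr.ecr.us-east-2.amazonaws.com

# 3. Pull the new images and relaunch all services.
docker-compose pull
docker-compose up -d

# 4. Give the services time to start, then verify (last line should be OK).
sleep 120
akamas status -d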

Audit logs

Akamas audit logs

Akamas stores all its logs in an internal Elasticsearch instance: some of these logs are reported to the user in the GUI to ease the monitoring of workflow executions, while others are only accessible via the CLI and are mostly used to provide more context and information for support requests.

Audit access can be performed using the CLI to extract logs related to UI or API access. For instance, to extract audit logs from the last hour, use the following commands:

  • UI Logs:

akamas logs --no-pagination -S kong -f -1h

  • API Logs:

akamas logs --no-pagination -S kong -f -1h

Notice: to visualize the system logs unrelated to the execution of workflows bound to workspaces, you need an account with administrative privileges.

Storing audit logs into files

To ease the integration with external logging systems, Akamas can be configured to store access logs into files. To enable this feature you should:

  1. Create a logs folder next to the Akamas docker-compose.yml file

  2. Edit the docker-compose.yml file by modifying the line FILE_LOG: "false" to FILE_LOG: "true"

  3. If Akamas is already running, issue the following command; otherwise, start Akamas first:

docker-compose up -d logstash

When the user interacts with the UI or the API, Akamas reports detailed access logs both in the internal database and in a file in the logs folder. To ease log rolling and management, every day Akamas creates a new file named according to the pattern access-%{+YYYY-MM-dd}.log.
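
For instance, to follow the current day's access log file from the folder created above (the file name follows the daily pattern just described):

# Hedged example: tail today's access log (run next to docker-compose.yml)
tail -f logs/access-$(date +%Y-%m-%d).log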