Optimizing a MongoDB server instance
In this example study, we are going to optimize a MongoDB single server instance by setting the performance goal of maximizing the throughput of operations toward the database.
Concerning performance tests, we are going to employ YCSB, a popular benchmark created by Yahoo for testing various NoSQL databases.
To extract MongoDB metrics, we are going to spin up a Prometheus instance and we are going to use the MongoDB Prometheus exporter.
You can use a single host for both MongoDB and YCSB but, in the following example, we replicate a common pattern in performance engineering by externalizing the load injection tool into a separate instance to avoid performance issues and measurement noise. The example therefore uses two hosts:
- mongo.mycompany.com for the MongoDB server instance (port 27017) and the MongoDB Prometheus exporter (port 9001)
- ycsb.mycompany.com for YCSB and the Prometheus server (port 9090)
Notice: the rest of this example assumes that you are working with Linux hosts.
To correctly extract MongoDB metrics we can leverage a solution like Prometheus, paired with the MongoDB Prometheus exporter. To do so we would need to:
1. Install the MongoDB Prometheus exporter on mongo.mycompany.com
2. Install and configure Prometheus on ycsb.mycompany.com
apt-get install prometheus-mongodb-exporter
By default, the exporter exposes MongoDB metrics on port 9001, which is the port targeted by the Prometheus configuration that follows.
The following YAML file (prometheus.yaml) is an example of the Prometheus configuration that you can use:
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. The default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  scrape_timeout: 15s

scrape_configs:
  # Mongo exporter
  - job_name: 'mongo_exporter'
    scrape_interval: 15s
    static_configs:
      - targets: ['mongo.mycompany.com:9001']
    relabel_configs:
      - source_labels: ["__address__"]
        regex: ".*"
        target_label: "instance"
        # this replacement should match the name of the Akamas component of MongoDB
        replacement: "mongo"
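To make the effect of the relabel_configs rule more concrete, here is a minimal Python sketch of a Prometheus-style "replace" rule. It is only an illustration under simplifying assumptions: the relabel function is a hypothetical helper, capture-group references in the replacement are not handled, and the real logic lives inside Prometheus.

```python
import re


def relabel(labels, source_labels, regex, target_label, replacement, separator=";"):
    """Apply one simplified 'replace' relabel rule to a set of labels."""
    # The values of the source labels are joined with the separator...
    source_value = separator.join(labels.get(name, "") for name in source_labels)
    # ...and the regex has to match the whole joined value.
    if re.fullmatch(regex, source_value) is None:
        return dict(labels)
    updated = dict(labels)
    # This sketch writes the replacement verbatim (no $1-style references).
    updated[target_label] = replacement
    return updated


scraped = {"__address__": "mongo.mycompany.com:9001", "job": "mongo_exporter"}
result = relabel(scraped, ["__address__"], ".*", "instance", "mongo")
print(result["instance"])
```

Because the regex ".*" matches any address, every sample scraped by this job ends up with instance="mongo", which is the label value the mongo component refers to.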
Since we are interested in tuning MongoDB by acting on its configuration parameters and by observing its throughput measured using YCSB, we need two components:
- A mongo component, which represents the MongoDB instance with all its configuration parameters and maps directly to mongo.mycompany.com
- A ycsb component, which represents YCSB. In particular, it "houses" the metrics of the performance test, which can be used as parts of the goal of a study. This component maps directly to ycsb.mycompany.com
Here’s the definition of our system (system.yaml):
name: mongodb system
description: reference system
Here’s the definition of our mongo component (mongo.yaml):
name: mongo
description: The MongoDB server
componentType: MongoDB-4 # MongoDB version 4
properties:
  hostname: mongo.mycompany.com
  prometheus:
    # we are telling Akamas that this component should be monitored using Prometheus, and each data-point with a label instance=mongo should be mapped to this component
    instance: mongo
  sshPort: 22
  username: myusername
  key: ... RSA KEY ...
Here’s the definition of our ycsb component (ycsb.yaml):
name: ycsb
description: The YCSB client
componentType: YCSB
properties:
  hostname: ycsb.mycompany.com
  instance: ycsb
  sshPort: 22
  username: myusername
  key: ... RSA KEY ...
We can create the system by running:
akamas create system system.yaml
We can then create the components by running:
akamas create component mongo.yaml "mongodb system"
akamas create component ycsb.yaml "mongodb system"
As described in the MongoDB optimization pack page, a workflow for optimizing MongoDB can be structured in three main steps:
1. Configure MongoDB with the configuration parameters decided by Akamas
2. Test the performance of the application
3. Prepare test results
Notice: here we have omitted the Cleanup step because it is not necessary for the context of this study.
We can define a workflow task that uses the FileConfigurator operator to interpolate Akamas parameters into a MongoDB configuration script:
name: configure mongo
operator: FileConfigurator
arguments:
  sourcePath: /home/myusername/mongo/templates/mongo_launcher.sh.templ # MongoDB configuration script with placeholders for Akamas parameters
  targetPath: /home/myusername/mongo/launcher.sh # configuration script with interpolated Akamas parameters
  component: mongo # mongo should match the component of your system that represents your MongoDB instance
Here’s an example of a templated configuration script for MongoDB:
#!/bin/sh

cd "$(dirname "$0")" || exit

CACHESIZE=${mongo.mongodb_cache_size}
SYNCDELAY=${mongo.mongodb_syncdelay}
EVICTION_DIRTY_TRIGGER=${mongo.mongodb_eviction_dirty_trigger}
EVICTION_DIRTY_TARGET=${mongo.mongodb_eviction_dirty_target}
EVICTION_THREADS_MIN=${mongo.mongodb_eviction_threads_min}
EVICTION_THREADS_MAX=${mongo.mongodb_eviction_threads_max}
EVICTION_TRIGGER=${mongo.mongodb_eviction_trigger}
EVICTION_TARGET=${mongo.mongodb_eviction_target}
USE_NOATIME=${mongo.mongodb_datafs_use_noatime}

# Here we have to remount the disk mongodb uses for data, to take advantage of the USE_NOATIME parameter

sudo service mongod stop
sudo umount /mnt/mongodb
if [ "$USE_NOATIME" = true ]; then
  sudo mount /dev/nvme0n1 /mnt/mongodb -o noatime
else
  sudo mount /dev/nvme0n1 /mnt/mongodb
fi
sudo service mongod start

# flush logs
echo -n | sudo tee /mnt/mongodb/log/mongod.log
sudo service mongod restart

# wait until MongoDB reports that it is accepting connections
until grep -q "waiting for connections on port 27017" /mnt/mongodb/log/mongod.log
do
  echo "waiting MongoDB..."
  sleep 60
done

sleep 5
sudo service prometheus-mongodb-exporter restart
# set knobs: all the WiredTiger options go inside a single quoted configuration string
mongo --quiet --eval "db.adminCommand({setParameter:1, 'wiredTigerEngineRuntimeConfig': 'cache_size=${CACHESIZE}m, eviction=(threads_min=$EVICTION_THREADS_MIN,threads_max=$EVICTION_THREADS_MAX), eviction_dirty_trigger=$EVICTION_DIRTY_TRIGGER, eviction_dirty_target=$EVICTION_DIRTY_TARGET, eviction_trigger=$EVICTION_TRIGGER, eviction_target=$EVICTION_TARGET'})"
mongo --quiet --eval "db = db.getSiblingDB('admin'); db.runCommand({ setParameter : 1, syncdelay: $SYNCDELAY})"

sleep 3
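To clarify how placeholders such as ${mongo.mongodb_cache_size} become concrete values, the following Python sketch mimics the interpolation step. It is not the FileConfigurator implementation: the interpolate function, the placeholder pattern, and the choice to raise an error on a missing value are all assumptions made for illustration.

```python
import re

# Matches ${component.parameter}. Plain shell variables such as ${CACHESIZE}
# contain no dot, so this sketch leaves them untouched.
PLACEHOLDER = re.compile(r"\$\{(\w+\.\w+)\}")


def interpolate(template, values):
    """Replace every ${component.parameter} placeholder with its value."""

    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value for placeholder {key}")
        return str(values[key])

    return PLACEHOLDER.sub(substitute, template)


template = "CACHESIZE=${mongo.mongodb_cache_size}\necho cache_size=${CACHESIZE}m\n"
rendered = interpolate(template, {"mongo.mongodb_cache_size": 1024})
print(rendered)
```

Only the Akamas-style placeholder is replaced, so the rendered file is an ordinary shell script in which ${CACHESIZE} is still resolved by the shell at run time.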
We can add a workflow task that actually executes the MongoDB configuration script produced by the FileConfigurator:
name: launch mongo
operator: Executor
arguments:
  command: bash /home/myusername/mongo/launcher.sh
  component: mongo # we can take all the ssh connection parameters from the properties of the mongo component
In each task, we leverage the reference to the "mongo" component to fetch from its properties all the authentication info needed to ssh into the right machine and let the FileConfigurator and Executor do their work.
We can define a workflow task that uses the Executor operator to launch the YCSB benchmark against MongoDB:
name: launch ycsb
operator: Executor
arguments:
  command: bash /home/myusername/ycsb/launch_load.sh
  component: ycsb # we can take all the ssh connection parameters from the properties of the ycsb component
Here’s an example of a launch script for YCSB:
#!/bin/bash

MONGODB_SERVER_IP="mongo.mycompany.com"
RECORDCOUNT=30000000
RUN_THREADS=10
LOAD_THREADS=10
DURATION=1800 # 30 minutes
WORKLOAD="a"

cd "$(dirname "$0")" || exit

# here we use the db_records file to check if we have already loaded the db with data
# if not we run a load script (a missing db_records file counts as "not loaded")
db_records=$(cat db_records 2>/dev/null)
if [ "$RECORDCOUNT" != "$db_records" ]; then
  bash scripts/create_db_mongo.sh "$MONGODB_SERVER_IP" "$RECORDCOUNT" "$LOAD_THREADS" "$WORKLOAD"
  echo "$RECORDCOUNT" > db_records
fi

cd /home/myusername/ycsb-0.15.0 || exit
# launch task in background
./bin/ycsb run mongodb-async -s -P workloads/workload"$WORKLOAD" -threads "$RUN_THREADS" -p recordcount="$RECORDCOUNT" -p operationcount=0 -p maxexecutiontime="$DURATION" -p mongodb.url=mongodb://"$MONGODB_SERVER_IP":27017 &> /home/myusername/ycsb/outputRun.txt &
PID=$!

# poll the output file while the benchmark is running
while kill -0 "$PID" >/dev/null 2>&1; do
  echo running

  if grep -q "java.net.ConnectException: Connection refused (Connection refused)" /home/myusername/ycsb/outputRun.txt; then
    echo "No connection, killing time!"
    echo -n > /home/myusername/ycsb/outputRun.txt
    ps -ef | grep -i com.yahoo.ycsb.Client | awk '{print $2}' | xargs -I{} kill -9 {}
    echo "Let's wait sometime... maybe Mongo is recovering data??"
    sleep 900
    exit 1
  fi

  if grep -Fxq "Could not create a connection to the server" /home/myusername/ycsb/outputRun.txt; then
    echo "Unable to connect to server!"
    kill -9 "$PID"
    rm /home/myusername/ycsb/outputRun.txt
    rm /home/myusername/ycsb/db_records
  else
    sleep 10
  fi
done
We can define a workflow task that launches a script that parses the YCSB results into a CSV file (Akamas will process the CSV file and then extract performance test metrics):
name: parse ycsb results
operator: Executor
arguments:
  command: python /home/myusername/ycsb/parser.py
  component: ycsb # we can take all the ssh connection parameters from the properties of the ycsb component
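The parser script itself is not shown in this guide. As a rough idea of what it could look like, here is a minimal Python sketch that turns YCSB status lines into CSV rows. Everything in it is an assumption: the function names are hypothetical, the status-line layout ("<timestamp>:<millis> N sec: M operations; X current ops/sec") is what YCSB typically prints when run with -s, and the column names Component, timestamp, and throughput are chosen to be consumed by a CSV-based telemetry setup.

```python
import csv
import io
import re

# Assumed layout of a YCSB status line, for example:
# 2021-03-01 10:00:10:123 10 sec: 12345 operations; 1234.5 current ops/sec; ...
STATUS_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}):\d{3} \d+ sec: \d+ operations; "
    r"([\d.]+) current ops/sec"
)


def parse_status_lines(lines, component="ycsb"):
    """Extract one row per status line, skipping every other line."""
    rows = []
    for line in lines:
        match = STATUS_LINE.match(line)
        if match:
            timestamp, throughput = match.groups()
            rows.append(
                {"Component": component, "timestamp": timestamp, "throughput": float(throughput)}
            )
    return rows


def write_csv(rows, stream):
    """Write the rows with a header, one metric per column."""
    writer = csv.DictWriter(stream, fieldnames=["Component", "timestamp", "throughput"])
    writer.writeheader()
    writer.writerows(rows)


sample = [
    "2021-03-01 10:00:10:123 10 sec: 12345 operations; 1234.5 current ops/sec; est completion in 1 day [READ: Count=6000]",
    "[OVERALL], Throughput(ops/sec), 1250.0",
    "2021-03-01 10:00:20:123 20 sec: 25000 operations; 1265.5 current ops/sec; [UPDATE: Count=6200]",
]
rows = parse_status_lines(sample)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

In a real parser, the lines would come from the output file written by the YCSB launch script and the CSV would be written to the location read by the telemetry provider, instead of the in-memory sample used here.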
By putting together all the tasks defined above we come up with the following workflow definition (workflow.yaml):
name: mongo workflow
tasks:
  - name: configure mongo
    operator: FileConfigurator
    arguments:
      sourcePath: /home/myusername/mongo/templates/mongo_launcher.sh.templ # MongoDB configuration script with placeholders for Akamas parameters
      targetPath: /home/myusername/mongo/launcher.sh # configuration script with interpolated Akamas parameters
      component: mongo # mongo should match the component of your system that represents your MongoDB instance

  - name: launch mongo
    operator: Executor
    arguments:
      command: bash /home/myusername/mongo/launcher.sh
      component: mongo # we can take all the ssh connection parameters from the properties of the mongo component

  - name: launch ycsb
    operator: Executor
    arguments:
      command: bash /home/myusername/ycsb/launch_load.sh
      component: ycsb # we can take all the ssh connection parameters from the properties of the ycsb component

  - name: parse ycsb results
    operator: Executor
    arguments:
      command: python /home/myusername/ycsb/parser.py
      component: ycsb # we can take all the ssh connection parameters from the properties of the ycsb component
We can create the workflow by running:
akamas create workflow workflow.yaml
Since we are employing Prometheus to extract MongoDB metrics, we can leverage the Prometheus provider to start ingesting data-points into Akamas. To use the Prometheus provider we need to define a telemetry-instance (prom.yaml):
provider: Prometheus # we are using Prometheus
config:
  address: ycsb.mycompany.com # address of Prometheus
  port: 9090

metrics:
  - metric: mongodb_connections_current
    datasourceMetric: mongodb_connections{instance="$INSTANCE$"}
    labels:
      - state
  - metric: mongodb_heap_used
    datasourceMetric: mongodb_extra_info_heap_usage_bytes{instance="$INSTANCE$"}
  - metric: mongodb_page_faults_total
    datasourceMetric: rate(mongodb_extra_info_page_faults_total{instance="$INSTANCE$"}[$DURATION$])
  - metric: mongodb_global_lock_current_queue
    datasourceMetric: mongodb_global_lock_current_queue{instance="$INSTANCE$"}
    labels:
      - type
  - metric: mongodb_mem_used
    datasourceMetric: mongodb_memory{instance="$INSTANCE$"}
    labels:
      - type
  - metric: mongodb_documents_inserted
    datasourceMetric: rate(mongodb_metrics_document_total{instance="$INSTANCE$", state="inserted"}[$DURATION$])
  - metric: mongodb_documents_updated
    datasourceMetric: rate(mongodb_metrics_document_total{instance="$INSTANCE$", state="updated"}[$DURATION$])
  - metric: mongodb_documents_deleted
    datasourceMetric: rate(mongodb_metrics_document_total{instance="$INSTANCE$", state="deleted"}[$DURATION$])
  - metric: mongodb_documents_returned
    datasourceMetric: rate(mongodb_metrics_document_total{instance="$INSTANCE$", state="returned"}[$DURATION$])
Notice: the fact that the instance definition contains the specification of Prometheus queries to map to Akamas metrics is temporary. In the next release, these queries will be embedded in Akamas.
By default, $DURATION$ will be replaced with 30s. You can override it by setting a duration property under prometheus within your mongo component.
We can now create the telemetry instance and attach it to our system by running:
akamas create telemetry-instance prom.yaml "mongodb system"
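As an illustration of how the $INSTANCE$ and $DURATION$ placeholders could turn into a concrete query, here is a small Python sketch. It assumes plain string substitution, which may differ from what Akamas does internally; render_query and instant_query_url are hypothetical helpers, while /api/v1/query is the instant-query endpoint of the Prometheus HTTP API.

```python
from urllib.parse import urlencode


def render_query(template, instance, duration="30s"):
    """Fill the $INSTANCE$ and $DURATION$ placeholders of a datasourceMetric."""
    return template.replace("$INSTANCE$", instance).replace("$DURATION$", duration)


def instant_query_url(address, port, query):
    """Build the URL of a Prometheus instant query for the given expression."""
    return f"http://{address}:{port}/api/v1/query?{urlencode({'query': query})}"


template = 'rate(mongodb_extra_info_page_faults_total{instance="$INSTANCE$"}[$DURATION$])'
query = render_query(template, instance="mongo")
url = instant_query_url("ycsb.mycompany.com", 9090, query)
print(query)
print(url)
```

Opening the resulting URL in a browser, or pasting the rendered query into the Prometheus UI, is a quick way to check that the exporter data is labeled as expected before starting a study.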
Beyond MongoDB metrics, it is important to ingest into Akamas metrics related to the performance tests run with YCSB, in particular the throughput of operations. To achieve this we can leverage the CSV Provider which parses a CSV file to extract relevant metrics. The CSV file we are going to parse with the help of the provider is the one produced in the last task of the workflow of the study.
To start using the provider, we need to define a telemetry instance (csv.yaml):
provider: CSV
config:
  protocol: scp
  address: ycsb.mycompany.com
  username: myusername
  authType: key
  auth: ... RSA KEY ...
  remoteFilePattern: /home/myusername/ycsb/output.csv
  csvFormat: horizontal
  componentColumn: Component
  timestampColumn: timestamp
  timestampFormat: yyyy-MM-dd HH:mm:ss

metrics:
  # here we put which metric found in the csv provider should be mapped to which akamas metrics
  # we are only interested in the throughput, but you can add other metrics if you want
  - metric: throughput
    datasourceMetric: throughput
  ....
We can create the telemetry instance and attach it to our system by running:
akamas create telemetry-instance csv.yaml "mongodb system"
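To show the file layout this configuration expects, the following Python sketch reads a CSV with a Component column, a timestamp column, and one column per metric. It assumes that this is what the horizontal format means; read_samples is a hypothetical helper, and the strptime pattern %Y-%m-%d %H:%M:%S is the Python equivalent of yyyy-MM-dd HH:mm:ss.

```python
import csv
import io
from datetime import datetime


def read_samples(stream, component_column="Component", timestamp_column="timestamp",
                 timestamp_format="%Y-%m-%d %H:%M:%S"):
    """Return (component, timestamp, metric, value) tuples, one per metric cell."""
    samples = []
    for row in csv.DictReader(stream):
        component = row.pop(component_column)
        timestamp = datetime.strptime(row.pop(timestamp_column), timestamp_format)
        # Every remaining column is treated as a metric.
        for metric, value in row.items():
            samples.append((component, timestamp, metric, float(value)))
    return samples


example = io.StringIO(
    "Component,timestamp,throughput\n"
    "ycsb,2021-03-01 10:00:10,1234.5\n"
    "ycsb,2021-03-01 10:00:20,1265.5\n"
)
samples = read_samples(example)
print(samples)
```

Running a check like this against the file produced by the parsing task helps to catch a wrong header or timestamp format before the telemetry provider does.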
Our goal for optimizing MongoDB is to maximize its throughput, measured using a performance test executed with YCSB.
Here’s the definition of the goal of our study, to maximize the throughput:
goal:
  objective: maximize
  function:
    formula: ycsb.throughput
The throughput of our MongoDB instance should be considered valid only when it is stable. For this reason, we can use the stability windowing policy, configured here to select a window of 100 consecutive samples whose standard deviation is no higher than 200, taken where the application throughput is at its maximum.
windowing:
  type: stability
  stability:
    # measure the goal function where the throughput has stdDev <= 200 for 100 consecutive data points
    metric: throughput
    labels:
      componentName: ycsb
    width: 100
    maxStdDev: 200
    # take only the temporal window when the throughput is maximum
    when:
      metric: throughput
      is: max
      labels:
        componentName: ycsb
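The idea behind this policy can be sketched in a few lines of Python. This is only an illustration and not the Akamas implementation: best_stable_window is a hypothetical function, it uses the population standard deviation, and it keeps the first window found when two windows have the same mean.

```python
from statistics import mean, pstdev


def best_stable_window(samples, width, max_std_dev):
    """Return (start_index, mean) of the stable window with the highest mean, or None."""
    best = None
    for start in range(len(samples) - width + 1):
        window = samples[start:start + width]
        # A window is stable when its standard deviation stays within the limit.
        if pstdev(window) <= max_std_dev:
            candidate = (start, mean(window))
            if best is None or candidate[1] > best[1]:
                best = candidate
    return best


# Toy throughput series: a ramp-up, a stable plateau, then a drop.
throughput = [100, 900, 1000, 1010, 990, 1000, 400, 410, 400]
result = best_stable_window(throughput, width=4, max_std_dev=200)
print(result)
```

With the toy series, the windows that include the ramp-up or the drop are discarded as unstable, and the plateau is selected. In the study, the same reasoning applies with width 100 and maxStdDev 200.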
We are going to optimize every MongoDB parameter:
parametersSelection:
  - name: mongo.mongodb_syncdelay
  - name: mongo.mongodb_eviction_dirty_trigger
  - name: mongo.mongodb_eviction_dirty_target
  - name: mongo.mongodb_eviction_target
  - name: mongo.mongodb_eviction_trigger
  - name: mongo.mongodb_eviction_threads_min
  - name: mongo.mongodb_eviction_threads_max
  - name: mongo.mongodb_cache_size
    # here we have changed the domain of the cache size since we suppose our mongo.mycompany.com host has 32gb of RAM, you should adapt to your own instance
    domain: [500, 32000]
We are going to add to our study two steps:
- A baseline step, in which we set a cache size of 1GB and use the default values for all the other MongoDB parameters
- An optimize step, in which we perform 100 experiments to generate the best configuration for MongoDB
Here’s what these steps look like:
steps:
  - name: baseline
    type: baseline
    values:
      mongo.mongodb_cache_size: 1024
    renderParameters:
      # use also all the other MongoDB parameters at their default value
      - mongo.*
  - name: optimize mongo
    type: optimize
    numberOfExperiments: 100
Here’s the study definition (study.yaml) for optimizing MongoDB:
name: study to tune MongoDB
description: study to tune MongoDB with YCSB perf test
system: mongodb system
workflow: mongo workflow
# Goal
goal:
  objective: maximize
  function:
    formula: ycsb.throughput
# Windowing
windowing:
  type: stability
  stability:
    metric: throughput
    labels:
      componentName: ycsb
    width: 100
    maxStdDev: 200
    when:
      metric: throughput
      is: max
      labels:
        componentName: ycsb
# parameters selection
parametersSelection:
  - name: mongo.mongodb_syncdelay
  - name: mongo.mongodb_eviction_dirty_trigger
  - name: mongo.mongodb_eviction_dirty_target
  - name: mongo.mongodb_eviction_target
  - name: mongo.mongodb_eviction_trigger
  - name: mongo.mongodb_eviction_threads_min
  - name: mongo.mongodb_eviction_threads_max
  - name: mongo.mongodb_cache_size
    # here we have changed the domain of the cache size since we suppose our mongo.mycompany.com host has 32gb of RAM
    domain: [500, 32000]
  - name: mongo.mongodb_datafs_use_noatime
parameterConstraints:
  - name: c1
    formula: mongo.mongodb_eviction_threads_min <= mongo.mongodb_eviction_threads_max
  - name: c2
    formula: mongo.mongodb_eviction_dirty_target <= mongo.mongodb_eviction_target
  - name: c3
    formula: mongo.mongodb_eviction_dirty_trigger <= mongo.mongodb_eviction_trigger
# steps
steps:
  - name: baseline
    type: baseline
    values:
      mongo.mongodb_cache_size: 1024
    renderParameters:
      # use also all the other MongoDB parameters at their default value
      - mongo.*
  - name: optimize mongo
    type: optimize
    numberOfExperiments: 100
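The parameterConstraints section restricts the configurations that can be explored: the minimum number of eviction threads cannot exceed the maximum, and each "dirty" eviction threshold cannot exceed its general counterpart. The Python sketch below spells out the same three checks. The violated_constraints helper and the sample values are purely illustrative and are not part of Akamas.

```python
def violated_constraints(config):
    """Return the names of the study constraints that a configuration violates."""
    checks = {
        "c1": config["mongo.mongodb_eviction_threads_min"]
        <= config["mongo.mongodb_eviction_threads_max"],
        "c2": config["mongo.mongodb_eviction_dirty_target"]
        <= config["mongo.mongodb_eviction_target"],
        "c3": config["mongo.mongodb_eviction_dirty_trigger"]
        <= config["mongo.mongodb_eviction_trigger"],
    }
    return [name for name, satisfied in checks.items() if not satisfied]


# Illustrative values only.
candidate = {
    "mongo.mongodb_eviction_threads_min": 4,
    "mongo.mongodb_eviction_threads_max": 8,
    "mongo.mongodb_eviction_dirty_target": 5,
    "mongo.mongodb_eviction_target": 80,
    "mongo.mongodb_eviction_dirty_trigger": 20,
    "mongo.mongodb_eviction_trigger": 95,
}
print(violated_constraints(candidate))
```

A configuration with, for example, more minimum than maximum eviction threads would be reported as violating c1, so it is never a valid candidate for an experiment.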
You can create the study by running:
akamas create study study.yaml
You can then start it by running:
akamas start study "study to tune MongoDB"