In this example study, we are going to optimize a MongoDB single server instance by setting the performance goal of maximizing the throughput of operations toward the database.
For performance testing, we are going to employ YCSB, a popular benchmark created by Yahoo for testing various NoSQL databases.
You can use a single host for both MongoDB and YCSB but, in the following example, we replicate a common pattern in performance engineering by externalizing the load-injection tool to a separate instance, to avoid performance interference and measurement noise. We use two hosts:

mongo.mycompany.com for the MongoDB server instance (port 27017) and the MongoDB Prometheus exporter (port 9100)

ycsb.mycompany.com for the YCSB load-injection tool and the Prometheus server (port 9090)
Notice: in the following, we assume all hosts are running Linux.
Prometheus and exporters
To correctly extract MongoDB metrics we can leverage a solution like Prometheus, paired with the MongoDB Prometheus exporter. To do so we would need to:
Install the MongoDB Prometheus exporter on mongo.mycompany.com
Install and configure Prometheus on ycsb.mycompany.com
Install the MongoDB Prometheus exporter
You can check how to install the exporter here. On Ubuntu you can use the system package manager:
apt-get install prometheus-mongodb-exporter
By default, the exporter exposes MongoDB metrics on port 9100.
Install and configure Prometheus
You can check how to configure Prometheus here; by default, it will run on port 9090.
The following YAML file, prometheus.yaml, is an example of the Prometheus configuration that you can use.
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. The default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  scrape_timeout: 15s

scrape_configs:
  # Mongo exporter
  - job_name: 'mongo_exporter'
    scrape_interval: 15s
    static_configs:
      - targets: ['mongo.mycompany.com:9100']
    relabel_configs:
      - source_labels: ["__address__"]
        regex: ".*"
        target_label: "instance"
        # this replacement should match the name of the Akamas component of MongoDB
        replacement: "mongo"
System
Since we are interested in tuning MongoDB by acting on its configuration parameters and by observing its throughput measured using YCSB, we need two components:
A mongo component, which represents the MongoDB instance with all its configuration parameters and maps directly to mongo.mycompany.com
A ycsb component, which represents YCSB; in particular, it "houses" the metrics of the performance test, which can be used in the goal of a study. This component maps directly to ycsb.mycompany.com
Here’s the definition of our system (system.yaml):
name: mongodb system
description: reference system
Here’s the definition of our mongo component (mongo.yaml):
name: mongo
description: The MongoDB server
componentType: MongoDB-4   # MongoDB version 4
properties:
  hostname: mongo.mycompany.com
  # we are telling Akamas that this component should be monitored using Prometheus,
  # and each data-point with a label instance=mongo should be mapped to this component
  prometheus:
    instance: mongo
  sshPort: 22
  username: myusername
  key: ... RSA KEY ...
Here’s the definition of our ycsb component (ycsb.yaml):
Workflow
As described in the MongoDB optimization pack page, a workflow for optimizing MongoDB can be structured in three main steps:
Configure MongoDB with the configuration parameters decided by Akamas
Test the performance of the application
Prepare test results
Notice: here we have omitted the Cleanup step because it is not necessary for the context of this study.
Configure MongoDB
We can define a workflow task that uses the FileConfigurator operator to interpolate Akamas parameters into a MongoDB configuration script:
name: configure mongo
operator: FileConfigurator
arguments:
  # MongoDB configuration script with placeholders for Akamas parameters
  sourcePath: /home/myusername/mongo/templates/mongo_launcher.sh.templ
  # configuration script with interpolated Akamas parameters
  targetPath: /home/myusername/mongo/launcher.sh
  # mongo should match the component of your system that represents your MongoDB instance
  component: mongo
Here’s an example of a templated configuration script for MongoDB:
#!/bin/sh
cd "$(dirname "$0")" || exit
CACHESIZE=${mongo.mongodb_cache_size}
SYNCDELAY=${mongo.mongodb_syncdelay}
EVICTION_DIRTY_TRIGGER=${mongo.mongodb_eviction_dirty_trigger}
EVICTION_DIRTY_TARGET=${mongo.mongodb_eviction_dirty_target}
EVICTION_THREADS_MIN=${mongo.mongodb_eviction_threads_min}
EVICTION_THREADS_MAX=${mongo.mongodb_eviction_threads_max}
EVICTION_TRIGGER=${mongo.mongodb_eviction_trigger}
EVICTION_TARGET=${mongo.mongodb_eviction_target}
USE_NOATIME=${mongo.mongodb_datafs_use_noatime}
# Here we have to remount the disk mongodb uses for data, to take advantage of the USE_NOATIME parameter
sudo service mongod stop
sudo umount /mnt/mongodb
if [ "$USE_NOATIME" = true ]; then
sudo mount /dev/nvme0n1 /mnt/mongodb -o noatime
else
sudo mount /dev/nvme0n1 /mnt/mongodb
fi
sudo service mongod start
# flush logs
echo -n | sudo tee /mnt/mongodb/log/mongod.log
sudo service mongod restart
until grep -q "waiting for connections on port 27017" /mnt/mongodb/log/mongod.log
do
echo "waiting MongoDB..."
sleep 60
done
sleep 5
sudo service prometheus-mongodb-exporter restart
# set knobs
mongo --quiet --eval "db.adminCommand({setParameter: 1, 'wiredTigerEngineRuntimeConfig': 'cache_size=${CACHESIZE}m,eviction=(threads_min=$EVICTION_THREADS_MIN,threads_max=$EVICTION_THREADS_MAX),eviction_dirty_trigger=$EVICTION_DIRTY_TRIGGER,eviction_dirty_target=$EVICTION_DIRTY_TARGET,eviction_trigger=$EVICTION_TRIGGER,eviction_target=$EVICTION_TARGET'})"
mongo --quiet --eval "db = db.getSiblingDB('admin'); db.runCommand({ setParameter : 1, syncdelay: $SYNCDELAY})"
sleep 3
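As a side note, the placeholder substitution performed by the FileConfigurator can be pictured with a small Python sketch (this is not the actual Akamas implementation); the parameter names come from the template above, while the values are made-up examples for one experiment:

```python
import re

def render_template(template: str, values: dict) -> str:
    """Replace every ${name} placeholder with the corresponding value."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: str(values[m.group(1)]), template)

# parameter names taken from the template above; the values are made-up examples
template = "CACHESIZE=${mongo.mongodb_cache_size}\nSYNCDELAY=${mongo.mongodb_syncdelay}"
rendered = render_template(template, {
    "mongo.mongodb_cache_size": 1024,
    "mongo.mongodb_syncdelay": 60,
})
```

Akamas writes the rendered result to targetPath, so the launcher script always runs with concrete values for the current experiment.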
We can add a workflow task that actually executes the MongoDB configuration script produced by the FileConfigurator:
name: launch mongo
operator: Executor
arguments:
  command: bash /home/myusername/mongo/launcher.sh
  # we can take all the ssh connection parameters from the properties of the mongo component
  component: mongo
In each task, we referenced the "mongo" component to fetch from its properties all the authentication information needed to SSH into the right machine and let the FileConfigurator and Executor do their work.
Test the performance of the application
We can define a workflow task that uses the Executor operator to launch the YCSB benchmark against MongoDB:
name: launch ycsb
operator: Executor
arguments:
  command: bash /home/myusername/ycsb/launch_load.sh
  # we can take all the ssh connection parameters from the properties of the ycsb component
  component: ycsb
Here’s an example of a launch script for YCSB:
#!/bin/bash
MONGODB_SERVER_IP="mongo.mycompany.com"
RECORDCOUNT=30000000
RUN_THREADS=10
LOAD_THREADS=10
DURATION=1800 # 30 minutes
WORKLOAD="a"
cd "$(dirname "$0")" || exit
# here we use the db_records file to check if we have already loaded the db with data
# if not we run a load script
db_records=$(cat db_records)
if [ "$RECORDCOUNT" != "$db_records" ]; then
bash scripts/create_db_mongo.sh ${MONGODB_SERVER_IP} "$RECORDCOUNT" "$LOAD_THREADS" "$WORKLOAD"
echo "$RECORDCOUNT" > db_records
fi
cd /home/myuser/ycsb-0.15.0 || exit
# launch task in background
./bin/ycsb run mongodb-async -s -P workloads/workload"$WORKLOAD" -threads "$RUN_THREADS" -p recordcount="$RECORDCOUNT" -p operationcount=0 -p maxexecutiontime="$DURATION" -p mongodb.url=mongodb://"$MONGODB_SERVER_IP":27017 &> /home/myuser/ycsb/outputRun.txt &
PID=$!
while kill -0 "$PID" >/dev/null 2>&1; do
echo running
if grep -q "java.net.ConnectException: Connection refused (Connection refused)" /home/myuser/ycsb/outputRun.txt; then
echo "No connection, killing time!"
echo -n > /home/myuser/ycsb/outputRun.txt
ps -ef | grep -i com.yahoo.ycsb.Client | awk '{print $2}' | xargs -I{} kill -9 {}
echo "Let's wait sometime... maybe Mongo is recovering data??"
sleep 900
exit 1
fi
if grep -Fxq "Could not create a connection to the server" /home/myuser/ycsb/outputRun.txt; then
echo "Unable to connect to server!"
kill -9 ${PID}
rm /home/myuser/ycsb/outputRun.txt
rm /home/myuser/ycsb/db_records
else
sleep 10
fi
done
Prepare test results
We can define a workflow task that launches a script that parses the YCSB results into a CSV file (Akamas will process the CSV file and then extract performance test metrics):
name: parse ycsb results
operator: Executor
arguments:
  command: python /home/myusername/ycsb/parser.py
  # we can take all the ssh connection parameters from the properties of the ycsb component
  component: ycsb
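The content of parser.py is not shown in this example. As a rough sketch, assuming YCSB's standard [OVERALL] summary lines and the horizontal CSV layout used later by the CSV provider, such a parser could look like this (the function and file names are illustrative, not part of the original script):

```python
import csv
import io

def ycsb_to_csv(ycsb_output: str, component: str, timestamp: str) -> str:
    """Extract the overall throughput from YCSB's summary output and emit
    a one-row CSV in the layout the CSV provider expects."""
    throughput = None
    for line in ycsb_output.splitlines():
        # YCSB summary lines look like: [OVERALL], Throughput(ops/sec), 4321.5
        if line.startswith("[OVERALL], Throughput(ops/sec),"):
            throughput = float(line.split(",")[2])
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["Component", "timestamp", "throughput"])
    writer.writerow([component, timestamp, throughput])
    return buf.getvalue()

sample = "[OVERALL], RunTime(ms), 1800000\n[OVERALL], Throughput(ops/sec), 4321.5"
csv_text = ycsb_to_csv(sample, "ycsb", "2021-06-01 10:30:00")
```

The important point is that the Component and timestamp columns let Akamas attribute each sample to the right component and point in time.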
Complete workflow
By putting together all the tasks defined above we come up with the following workflow definition (workflow.yaml):
name: mongo workflow
tasks:
  - name: configure mongo
    operator: FileConfigurator
    arguments:
      # MongoDB configuration script with placeholders for Akamas parameters
      sourcePath: /home/myusername/mongo/templates/mongo_launcher.sh.templ
      # configuration script with interpolated Akamas parameters
      targetPath: /home/myusername/mongo/launcher.sh
      # mongo should match the component of your system that represents your MongoDB instance
      component: mongo
  - name: launch mongo
    operator: Executor
    arguments:
      command: bash /home/myusername/mongo/launcher.sh
      component: mongo
  - name: launch ycsb
    operator: Executor
    arguments:
      command: bash /home/myusername/ycsb/launch_load.sh
      component: ycsb
  - name: parse ycsb results
    operator: Executor
    arguments:
      command: python /home/myusername/ycsb/parser.py
      component: ycsb
We can create the workflow by running:
akamas create workflow workflow.yaml
Telemetry
Prometheus
Since we are employing Prometheus to extract MongoDB metrics, we can leverage the Prometheus provider to start ingesting data-points into Akamas. To use the Prometheus provider we need to define a telemetry-instance (prom.yaml):
provider: Prometheus  # we are using Prometheus
config:
  address: ycsb.mycompany.com  # address of Prometheus
  port: 9090
metrics:
  - metric: mongodb_connections_current
    datasourceMetric: mongodb_connections{instance="$INSTANCE$"}
    labels:
      - state
  - metric: mongodb_heap_used
    datasourceMetric: mongodb_extra_info_heap_usage_bytes{instance="$INSTANCE$"}
  - metric: mongodb_page_faults_total
    datasourceMetric: rate(mongodb_extra_info_page_faults_total{instance="$INSTANCE$"}[$DURATION$])
  - metric: mongodb_global_lock_current_queue
    datasourceMetric: mongodb_global_lock_current_queue{instance="$INSTANCE$"}
    labels:
      - type
  - metric: mongodb_mem_used
    datasourceMetric: mongodb_memory{instance="$INSTANCE$"}
    labels:
      - type
  - metric: mongodb_documents_inserted
    datasourceMetric: rate(mongodb_metrics_document_total{instance="$INSTANCE$", state="inserted"}[$DURATION$])
  - metric: mongodb_documents_updated
    datasourceMetric: rate(mongodb_metrics_document_total{instance="$INSTANCE$", state="updated"}[$DURATION$])
  - metric: mongodb_documents_deleted
    datasourceMetric: rate(mongodb_metrics_document_total{instance="$INSTANCE$", state="deleted"}[$DURATION$])
  - metric: mongodb_documents_returned
    datasourceMetric: rate(mongodb_metrics_document_total{instance="$INSTANCE$", state="returned"}[$DURATION$])
Notice: the fact that the instance definition contains the specification of Prometheus queries to map to Akamas metrics is temporary. In the next release, these queries will be embedded in Akamas.
By default, $DURATION$ will be replaced with 30s. You can override it to fit your needs by setting a duration property under prometheus within your mongo component.
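For example, to make the rate() queries use a 60-second window, the mongo component definition could be extended as follows (the 60s value is just an example; pick one that matches your scrape interval):

```yaml
properties:
  hostname: mongo.mycompany.com
  prometheus:
    instance: mongo
    duration: 60s   # overrides the default 30s used to replace $DURATION$
```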
We can now create the telemetry instance and attach it to our system by running:
CSV
Beyond MongoDB metrics, it is important to ingest into Akamas the metrics related to the performance tests run with YCSB, in particular the throughput of operations. To achieve this, we can leverage the CSV provider, which extracts metrics by parsing a CSV file. The file we are going to parse is the one produced in the last task of the study's workflow.
To start using the provider, we need to define a telemetry instance (csv.yaml):
provider: CSV
config:
  protocol: scp
  address: ycsb.mycompany.com
  username: myuser
  authType: key
  auth: ... RSA KEY ...
  remoteFilePattern: /home/ubuntu/ycsb/output.csv
  csvFormat: horizontal
  componentColumn: Component
  timestampColumn: timestamp
  timestampFormat: yyyy-MM-dd HH:mm:ss
metrics:
  # here we specify which metric found in the CSV file should be mapped to which Akamas metric;
  # we are only interested in the throughput, but you can add other metrics if you want
  - metric: throughput
    datasourceMetric: throughput
  ...
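For reference, a hypothetical output.csv matching this configuration could look as follows (the values are made up; note that the Component column must match the name of the ycsb component and the timestamps must follow the declared timestampFormat):

```csv
Component,timestamp,throughput
ycsb,2021-06-01 10:00:00,11850.5
ycsb,2021-06-01 10:00:15,12034.2
```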
We can create the telemetry instance and attach it to our system by running:
Windowing
The throughput of our MongoDB instance should be considered valid only when it is stable; for this reason, we use the stability windowing policy. This policy identifies a period of at least 100 samples with a standard deviation lower than 200, within the time frame when the application throughput is at its maximum.
windowing:
  type: stability
  stability:
    # measure the goal function where the throughput has stdDev <= 200 for 100 consecutive data points
    metric: throughput
    labels:
      componentName: ycsb
    width: 100
    maxStdDev: 200
    # take only the temporal window when the throughput is maximum
    when:
      metric: throughput
      is: max
      labels:
        componentName: ycsb
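To build intuition for what this policy selects, here is a small Python sketch (not Akamas code) of the same idea: scan fixed-width windows of the throughput series, keep those whose standard deviation is within bounds, and pick the one containing the maximum throughput. The width and threshold below are shrunk for readability.

```python
import statistics

def stable_windows(samples, width, max_std_dev):
    """Yield (start_index, window) for each run of `width` consecutive
    samples whose standard deviation is within max_std_dev."""
    for start in range(len(samples) - width + 1):
        window = samples[start:start + width]
        if statistics.pstdev(window) <= max_std_dev:
            yield start, window

def best_stable_window(samples, width, max_std_dev):
    """Among all stable windows, pick the one containing the highest sample."""
    candidates = list(stable_windows(samples, width, max_std_dev))
    return max(candidates, key=lambda c: max(c[1])) if candidates else None

# toy throughput trace: ramp-up, stable plateau, then a noisy tail
trace = [100, 500, 900, 1000, 1010, 1005, 995, 1000, 400, 1200]
start, window = best_stable_window(trace, width=4, max_std_dev=50)
```

The ramp-up and the noisy tail are excluded, so the goal function is evaluated only on the stable plateau, which is exactly what the stability policy achieves for the real study.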
Parameters to optimize
We are going to optimize the following MongoDB parameters:
parametersSelection:
  - name: mongo.mongodb_syncdelay
  - name: mongo.mongodb_eviction_dirty_trigger
  - name: mongo.mongodb_eviction_dirty_target
  - name: mongo.mongodb_eviction_target
  - name: mongo.mongodb_eviction_trigger
  - name: mongo.mongodb_eviction_threads_min
  - name: mongo.mongodb_eviction_threads_max
  # here we have changed the domain of the cache size since we suppose our
  # mongo.mycompany.com host has 32 GB of RAM; you should adapt it to your own instance
  - name: mongo.mongodb_cache_size
    domain: [500, 32000]
  - name: mongo.mongodb_datafs_use_noatime
Steps
We are going to add to our study two steps:
A baseline step, in which we set a cache size of 1GB and use the default values for all the other MongoDB parameters
An optimize step, in which we perform 100 experiments to generate the best configuration for MongoDB
Here’s what these steps look like:
steps:
  - name: baseline
    type: baseline
    values:
      mongo.mongodb_cache_size: 1024
    # use also all the other MongoDB parameters at their default value
    renderParameters:
      - mongo.*
  - name: optimize mongo
    type: optimize
    numberOfExperiments: 100
Complete study
Here’s the study definition (study.yaml) for optimizing MongoDB:
name: study to tune MongoDB
description: study to tune MongoDB with YCSB perf test
system: mongodb system
workflow: mongo workflow

# Goal
goal:
  objective: maximize
  function:
    formula: ycsb.throughput

# Windowing
windowing:
  type: stability
  stability:
    metric: throughput
    labels:
      componentName: ycsb
    width: 100
    maxStdDev: 200
    when:
      metric: throughput
      is: max
      labels:
        componentName: ycsb

# Parameters selection
parametersSelection:
  - name: mongo.mongodb_syncdelay
  - name: mongo.mongodb_eviction_dirty_trigger
  - name: mongo.mongodb_eviction_dirty_target
  - name: mongo.mongodb_eviction_target
  - name: mongo.mongodb_eviction_trigger
  - name: mongo.mongodb_eviction_threads_min
  - name: mongo.mongodb_eviction_threads_max
  # here we have changed the domain of the cache size since we suppose our
  # mongo.mycompany.com host has 32 GB of RAM
  - name: mongo.mongodb_cache_size
    domain: [500, 32000]
  - name: mongo.mongodb_datafs_use_noatime

parameterConstraints:
  - name: c1
    formula: mongo.mongodb_eviction_threads_min <= mongo.mongodb_eviction_threads_max
  - name: c2
    formula: mongo.mongodb_eviction_dirty_target <= mongo.mongodb_eviction_target
  - name: c3
    formula: mongo.mongodb_eviction_dirty_trigger <= mongo.mongodb_eviction_trigger

# Steps
steps:
  - name: baseline
    type: baseline
    values:
      mongo.mongodb_cache_size: 1024
    # use also all the other MongoDB parameters at their default value
    renderParameters:
      - mongo.*
  - name: optimize mongo
    type: optimize
    numberOfExperiments: 100