Speeding up puppet runs by using checksums when running execs

Posted on Fri 20 March 2015 in software

During integration work for a client, I was running third-party puppet code (the deployit / XL Deploy module for puppet) to integrate automatically deployed application containers with a third-party deployment tool.

This module was written to run an exec for three different actions per container, each of which ran Jython code in a local JVM. Each run resulted in a couple of API / REST calls to the central deployment server to register this particular container.

Running this code every 30 minutes on 40+ hosts was dead slow, and it was pounding the central server receiving the calls, all while the properties being transferred had not changed and did not need any resending. Ugh 🙁

I kept hoping that the vendor would come up with something smart, so this situation persisted for some time. About a week ago I found the time to create an alternative: instead of using the third-party module, I wrote a shell script that parses a puppet-managed properties file.

The script starts by reading all the properties and checksumming the concatenated result. If the sha256 sum has not changed since the last run, the script ends there. If there is no old checksum or the checksum does not match, the original routine (Jython running in a JVM, making REST calls) is run.
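The skip logic described above can be sketched in a few lines of bash. This is a minimal illustration, not the actual script (which follows further down); the function name `props_changed` and the temporary state file are made up for the example, and it assumes GNU `sha256sum` is available.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 when the input differs from what was seen on
# the previous call (and records the new checksum), 1 when it is unchanged.
props_changed() {
    local state_file=$1 input=$2
    local current previous="none"

    current=$(printf '%s' "$input" | sha256sum | awk '{ print $1 }')
    if [[ -r $state_file ]]; then
        previous=$(<"$state_file")
    fi

    if [[ $current == "$previous" ]]; then
        return 1
    fi
    printf '%s\n' "$current" > "$state_file"
    return 0
}

# Demo: the first call sees a change, the second call with the same input does not.
demo_dir=$(mktemp -d)
if props_changed "$demo_dir/sum" "a=1 b=2"; then echo "changed"; fi
if ! props_changed "$demo_dir/sum" "a=1 b=2"; then echo "unchanged"; fi
```

The expensive work would only go in the branch where the function reports a change.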

Before I wrote this alternative, puppet runs would easily take minutes, waiting for the execs to finish. Now, with the central server offloaded, the typical puppet run is down to 80 seconds when the execs have to run and 29 seconds when the checksums match.

[root@example ~]# puppet agent -t
Info: Retrieving plugin
Info: Loading facts in /var/opt/lib/pe-puppet/lib/facter/pe_build.rb
Info: Loading facts in /var/opt/lib/pe-puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/opt/lib/pe-puppet/lib/facter/xldeploy_facts.rb
[snip]
Info: Caching catalog for example.local
Info: Applying configuration version '1426664207'
Notice: /Stage[main]/app1::Tomcat/my_tomcat::Instance[app1-service]/Exec[xldeploy-register-tomcat-app1-service]/returns: executed successfully
Notice: /Stage[main]/app2::Tomcat/my_tomcat::Instance[app2]/Exec[xldeploy-register-tomcat-app2]/returns: executed successfully
Notice: /Stage[main]/app3::Tomcat/my_tomcat::Instance[app3-service]/Exec[xldeploy-register-tomcat-app3-service]/returns: executed successfully
Notice: /Stage[main]/app4::Tomcat/my_tomcat::Instance[app4]/Exec[xldeploy-register-tomcat-app4]/returns: executed successfully
Notice: Finished catalog run in 28.85 seconds
[root@example ~]#

XL Deploy register script

#!/bin/bash
#
#
# This script can be used as an alternative to the slow puppet resources for
# XL Deploy. Use puppet to fill a property file for your XL Deploy container
# instance and call this script with the property file as argument.
#
# When it needs to register stuff in XL Deploy it is still slow as hell
# (the XL Deploy cli is just slow), but what this script does is test if
# the properties have changed since the last run and skip everything when
# nothing seems to need any updating. Should save us tons of time during
# puppet runs AND should save us countless REST calls on the XL Deploy
# instance.
#
# 12-mar-2015
#

BASEDIR=/opt/dummy
VARDIR=/tmp

XLD_HOST=xldeploy.example.com
XLD_USER=admin
XLD_PASS=<%= @xldeploy_admin_password %>

export DEPLOYIT_CLI_HOME=/opt/deployit-cli
export DEPLOYIT_CLI_OPTS="-Xmx512m"

if [[ -z "$1" ]]; then
    echo "Usage: $0 [properties file]"
    exit 1
fi

if [[ ! -r "$1" ]]; then
    echo "Unable to read properties file $1"
    exit 2
else
    EXT_PROPERTIES=$1
fi

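function report()
{
    logger "$1"
    echo "$1"
}

# Print the value of the given key from the properties file.
function get_property()
{
    KEYWORD=$1
    # Match the key only at the start of a line, so comments and keys that
    # merely contain it are skipped; print everything after the first '='.
    VALUE=$(/bin/grep -m1 "^${KEYWORD}=" "$EXT_PROPERTIES" | /bin/cut -d '=' -f2-)
    echo "$VALUE"
}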

## run a command using the CLI. If it fails, remove any existing checksum to force
## a retry during the next run of this script.
function run_xld_command()
{
    COMMAND=$1
    # $COMMAND is left unquoted on purpose: it must be split into separate CLI arguments.
    /opt/deployit-cli/bin/cli.sh -q -host "$XLD_HOST" -port 443 -secure -context /deployit/deployit -username "$XLD_USER" -password "$XLD_PASS" $COMMAND
    # Use -ne for a numeric test; '>' inside [[ ]] is a string comparison.
    if [[ $? -ne 0 && -f "$CHECKSUM" ]]; then rm -f "$CHECKSUM"; fi
}

function create_ci()
{
    CI_ID=$1
    CI_TYPE=$2
    CI_PARAMS=$3
    run_xld_command "-f /opt/deployit-puppet-module/create-ci.py -- $CI_ID $CI_TYPE $CI_PARAMS"
}

function set_tags()
{
    CI_ID=$1
    CI_TAG=$2
    run_xld_command "-f /opt/deployit-puppet-module/set-tags.py -- $CI_ID $CI_TAG"
}

function set_environment()
{
    CI_ID=$1
    CI_ENVIRONMENT=$2
    run_xld_command "-f /opt/deployit-puppet-module/set-envs.py -- $CI_ID $CI_ENVIRONMENT"
}

# The function below checks if anything in the XL Deploy settings for this
# CI has changed since the last run. If anything has changed, the checksum on
# disk is updated (and the caller should rerun the command on the XL Deploy
# server to update this CI).
function xlconfig_changed()
{
    CHECK_STRING=$1
    CURRENT_SUM=$(echo "$CHECK_STRING" | /usr/bin/sha256sum | /bin/awk '{ print $1; }')
    if [[ -r "$CHECKSUM" ]]; then
        PREVIOUS_SUM=$(cat "$CHECKSUM")
    else
        PREVIOUS_SUM="none"
    fi

    if [[ "$PREVIOUS_SUM" != "$CURRENT_SUM" ]]; then
        CHANGED=true
        logger "Checksum $CURRENT_SUM does not match $PREVIOUS_SUM"
        echo "$CURRENT_SUM" > "$CHECKSUM"
    else
        CHANGED=false
        logger "Checksum $CURRENT_SUM same as $PREVIOUS_SUM"
    fi

    echo "$CHANGED"
}


HOST_CI_ID=$(get_property "host.ci.id")
HOST_CI_TYPE=$(get_property "host.ci.type")
HOST_CI_TAG=$(get_property "host.ci.tag")
HOST_CI_PARAMS=$(get_property "host.ci.params")

CONTAINER_CI_ID=$(get_property "container.ci.id")
CONTAINER_CI_TYPE=$(get_property "container.ci.type")
CONTAINER_CI_TAG=$(get_property "container.ci.tag")
CONTAINER_CI_PARAMS=$(get_property "container.ci.params")

CI_ENVIRONMENT=$(get_property "ci.environment")

CHECKSUM=$(get_property "checksum.path")

UPDATE_NEEDED=$(xlconfig_changed "$HOST_CI_ID $HOST_CI_TYPE $HOST_CI_TAG $HOST_CI_PARAMS $CONTAINER_CI_ID $CONTAINER_CI_TYPE $CONTAINER_CI_TAG $CONTAINER_CI_PARAMS $CI_ENVIRONMENT")

if [[ $UPDATE_NEEDED == "true" ]]; then
    report "Running the java/python code to update this CI in the XL Deploy server"
    create_ci "$HOST_CI_ID" "$HOST_CI_TYPE" "$HOST_CI_PARAMS"
    create_ci "$CONTAINER_CI_ID" "$CONTAINER_CI_TYPE" "$CONTAINER_CI_PARAMS"
    set_tags "$HOST_CI_ID" "$HOST_CI_TAG"
    set_tags "$CONTAINER_CI_ID" "$CONTAINER_CI_TAG"
    set_environment "$HOST_CI_ID" "$CI_ENVIRONMENT"
    set_environment "$CONTAINER_CI_ID" "$CI_ENVIRONMENT"
else
    report "Nothing to do here"
fi

exit 0

Example properties file

# managed by puppet
#
host.ci.id=<%= @host_ci_id %>
host.ci.type=<%= @host_ci_type %>
host.ci.tag=<%= @host_ci_tag %>
host.ci.params=<%= @host_ci_params %>

container.ci.id=<%= @container_ci_id %>
container.ci.type=<%= @container_ci_type %>
container.ci.tag=<%= @container_ci_tag %>
container.ci.params=<%= @container_ci_params %>

ci.environment=<%= @ci_environment %>
checksum.path=<%= @checksum_path %>

This code is probably way too specific to be of any use as is, but maybe the checksumming solution for ‘expensive’ exec calls during puppet runs is applicable to other scenarios as well.
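As a rough idea of how the approach might carry over to other expensive execs, here is a generic wrapper sketch. The function name `run_if_changed`, its argument order, and the paths in the usage comment are all hypothetical, and it assumes GNU `sha256sum`. It also differs from the script above in one respect: the checksum is stored only after the command succeeds, so a failed run is retried on the next puppet run without any cleanup step.

```shell
#!/bin/bash
# run_if_changed STATE_FILE INPUT_FILE COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND only when INPUT_FILE has changed since
# the last successful run, as recorded by the checksum in STATE_FILE.
run_if_changed() {
    local state_file=$1 input_file=$2
    shift 2
    local current previous=""

    # Without a readable input file there is nothing to compare against.
    [[ -r $input_file ]] || return 2

    current=$(sha256sum < "$input_file" | awk '{ print $1 }')
    if [[ -r $state_file ]]; then
        previous=$(<"$state_file")
    fi

    if [[ $current == "$previous" ]]; then
        echo "Nothing to do here"
        return 0
    fi

    # Run the expensive command; keep the old state if it fails.
    "$@" || return $?
    printf '%s\n' "$current" > "$state_file"
}

# Usage (hypothetical paths):
#   run_if_changed /var/tmp/app1.sha256 /etc/app1.properties /usr/local/bin/register-app1
```

Whether to write the checksum before or after the expensive call is a design choice; writing it afterwards keeps the failure handling in one place.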