Fix benchmark failures and improve results quality #8962

Merged
merged 56 commits into from
Jun 18, 2025
db10cbc
fix: Add CPU affinity to load benchmarks
ddyurchenko Jun 11, 2025
7245a76
wip: Force trigger Gitlab CI
ddyurchenko Jun 11, 2025
20f1d6a
wip: Generalize affinities, export k6 results, change executor
ddyurchenko Jun 12, 2025
6ce4ca6
wip: Save backup of logs
ddyurchenko Jun 12, 2025
06cf0bb
wip: Reduce execution time
ddyurchenko Jun 12, 2025
5edc665
wip: Tweak duration, add warmup stage, backup logs
ddyurchenko Jun 12, 2025
85a9dee
fix: backup-reports location
ddyurchenko Jun 12, 2025
2d59094
fix: Fix k6 config
ddyurchenko Jun 12, 2025
76060dc
fix: Fix k6 scenario names
ddyurchenko Jun 12, 2025
02de582
fix: Fix k6 scenario names
ddyurchenko Jun 12, 2025
391519f
fix: Change k6 names format
ddyurchenko Jun 13, 2025
8ba2eee
fix: Increase warmup time
ddyurchenko Jun 13, 2025
3e9ec83
fix: Reduce gracefulStop
ddyurchenko Jun 13, 2025
987daf6
fix: Shift start time
ddyurchenko Jun 13, 2025
4ec4ad7
fix: Fix local runs, reduce warmup duration
ddyurchenko Jun 13, 2025
6b085c9
fix: Change debug logs location
ddyurchenko Jun 13, 2025
4654c1a
tweak: CPU affinities to make sure bottleneck is on k6 side
ddyurchenko Jun 16, 2025
c71da3e
tweak: Output Java version
ddyurchenko Jun 16, 2025
b137845
test: Use constant-arrival-rate executor
ddyurchenko Jun 16, 2025
4ea19a0
tweak: Increase warmup time for petclinic
ddyurchenko Jun 16, 2025
28de22a
fix: Fix mistake in startTime
ddyurchenko Jun 16, 2025
4cbcacc
tweak: Reduce RPS to compensate for loss on profiling petclinic variant
ddyurchenko Jun 16, 2025
238ea6a
tweak: Parallelize load tests
ddyurchenko Jun 13, 2025
e82f1e4
tweak: Parallelize insecure-bank, tweak logs storage
ddyurchenko Jun 13, 2025
660cf7c
fix: Fix run script location
ddyurchenko Jun 13, 2025
a0cb015
tweak: Increase iterations count
ddyurchenko Jun 13, 2025
4d2fbd6
fix: Fix paths
ddyurchenko Jun 17, 2025
077c92d
wip: Remove sirun, add verbosity
ddyurchenko Jun 17, 2025
c61647a
wip: Fix bash error
ddyurchenko Jun 17, 2025
82bd4f5
wip: Fix bash error
ddyurchenko Jun 17, 2025
cfa13c0
wip: Fix bash error
ddyurchenko Jun 17, 2025
86fc41f
wip: Fix bash error
ddyurchenko Jun 17, 2025
4c0c13a
wip: Fix bash error
ddyurchenko Jun 17, 2025
7fa0083
wip: Fix bash error
ddyurchenko Jun 17, 2025
3dd9879
wip: Add more log lines
ddyurchenko Jun 17, 2025
b5268bb
wip: Add more log lines
ddyurchenko Jun 17, 2025
cd930aa
wip: Fix REPORTS_DIR
ddyurchenko Jun 17, 2025
9469599
wip: Fix k6.js
ddyurchenko Jun 17, 2025
b17c403
wip: Fix env vars
ddyurchenko Jun 17, 2025
50dbe86
wip: Fix k6.js
ddyurchenko Jun 17, 2025
365255e
wip: Fix k6.js & cleanup
ddyurchenko Jun 17, 2025
7cbbd51
wip: Reduce verbosity for load apps
ddyurchenko Jun 17, 2025
b799d72
wip: Disable unbound var check
ddyurchenko Jun 17, 2025
0746b0c
wip: Fix pid collection
ddyurchenko Jun 17, 2025
df980c3
wip: Fix pid collection
ddyurchenko Jun 17, 2025
8aa0cca
Revert "wip: Fix env vars"
ddyurchenko Jun 17, 2025
ab054eb
wip: Separate LOGS_DIR
ddyurchenko Jun 17, 2025
b23b4dc
wip: Reduce verbosity
ddyurchenko Jun 17, 2025
b137583
tweak: k6 config, multiple repetitions
ddyurchenko Jun 17, 2025
72502fb
tweak: multiple repetitions
ddyurchenko Jun 17, 2025
1df8515
tweak: Update BP branch, multiple repetitions
ddyurchenko Jun 17, 2025
db5d3e1
tweak: Tweak RPS for petclinic
ddyurchenko Jun 17, 2025
e4eced2
tweak: wait for health check
ddyurchenko Jun 17, 2025
8e78e94
tweak: Tweak RPS
ddyurchenko Jun 17, 2025
c40ca32
tweak: Increase warmup time
ddyurchenko Jun 18, 2025
e3de508
tweak: Remove unused variants in insecure-bank, tidy k6 .sh commands
ddyurchenko Jun 18, 2025
2 changes: 1 addition & 1 deletion .gitlab/benchmarks.yml
@@ -15,7 +15,7 @@
   script:
     - export ARTIFACTS_DIR="$(pwd)/reports" && mkdir -p "${ARTIFACTS_DIR}"
     - git config --global url."https://gitlab-ci-token:${CI_JOB_TOKEN}@gitlab.ddbuild.io/DataDog/".insteadOf "https://github.com/DataDog/"
-    - git clone --branch dd-trace-java/tracer-benchmarks https://github.com/DataDog/benchmarking-platform.git /platform && cd /platform
+    - git clone --branch dd-trace-java/tracer-benchmarks-parallel https://github.com/DataDog/benchmarking-platform.git /platform && cd /platform
   artifacts:
     name: "reports"
     paths:
4 changes: 0 additions & 4 deletions benchmark/benchmarks.sh
@@ -33,10 +33,6 @@ if [[ ! -f "${TRACER}" ]]; then
   cd "${SCRIPT_DIR}"
 fi

-# Cleanup previous reports
-rm -rf "${REPORTS_DIR}"
-mkdir -p "${REPORTS_DIR}"
-
 if [[ "$#" == '0' ]]; then
   for type in 'startup' 'load' 'dacapo'; do
     run_benchmarks "$type"
40 changes: 0 additions & 40 deletions benchmark/load/insecure-bank/benchmark.json

This file was deleted.

62 changes: 53 additions & 9 deletions benchmark/load/insecure-bank/k6.js
@@ -1,18 +1,62 @@
 import http from 'k6/http';
 import {checkResponse, isOk, isRedirect} from "../../utils/k6.js";

-const baseUrl = 'http://localhost:8080';
+const variants = {
+  "no_agent": {
+    "APP_URL": 'http://localhost:8080',
+  },
+  "tracing": {
+    "APP_URL": 'http://localhost:8081',
+  },
+  "profiling": {
+    "APP_URL": 'http://localhost:8082',
+  },
+  "iast": {
+    "APP_URL": 'http://localhost:8083',
+  },
+  "iast_GLOBAL": {
+    "APP_URL": 'http://localhost:8084',
+  },
+  "iast_FULL": {
+    "APP_URL": 'http://localhost:8085',
+  },
+}
+
+export const options = function (variants) {
+  let scenarios = {};
+  for (const variant of Object.keys(variants)) {
+    scenarios[`load--insecure-bank--${variant}--warmup`] = {
+      executor: 'constant-vus', // https://grafana.com/docs/k6/latest/using-k6/scenarios/executors/#all-executors
+      vus: 5,
+      duration: '20s',
+      gracefulStop: '2s',
+      env: {
+        "APP_URL": variants[variant]["APP_URL"]
+      }
+    };
+
+    scenarios[`load--insecure-bank--${variant}--high_load`] = {
+      executor: 'constant-vus',
+      vus: 5,
+      startTime: '22s',
+      duration: '15s',
+      gracefulStop: '2s',
+      env: {
+        "APP_URL": variants[variant]["APP_URL"]
+      }
+    };
+  }

-export const options = {
-  discardResponseBodies: true,
-  vus: 5,
-  iterations: 40000
-};
+  return {
+    discardResponseBodies: true,
+    scenarios,
+  }
+}(variants);

 export default function () {

   // login form
-  const loginResponse = http.post(`${baseUrl}/login`, {
+  const loginResponse = http.post(`${__ENV.APP_URL}/login`, {
     username: 'john',
     password: 'test'
   }, {
@@ -21,11 +65,11 @@ export default function () {
   checkResponse(loginResponse, isRedirect);

   // dashboard
-  const dashboard = http.get(`${baseUrl}/dashboard`);
+  const dashboard = http.get(`${__ENV.APP_URL}/dashboard`);
   checkResponse(dashboard, isOk);

   // logout
-  const logout = http.get(`${baseUrl}/j_spring_security_logout`, {
+  const logout = http.get(`${__ENV.APP_URL}/j_spring_security_logout`, {
     redirects: 0
   });
   checkResponse(logout, isRedirect);
28 changes: 28 additions & 0 deletions benchmark/load/insecure-bank/start-servers.sh
@@ -0,0 +1,28 @@
+#!/usr/bin/env bash
+
+set -e
+
+start_server() {
+  local VARIANT=$1
+  local JAVA_OPTS=$2
+
+  if [ -n "$CI_JOB_TOKEN" ]; then
+    # Inside BP, so we can assume 24 CPU cores available and set CPU affinity
+    CPU_AFFINITY_APP=$3
+  else
+    CPU_AFFINITY_APP=""
+  fi
+
+  mkdir -p "${LOGS_DIR}/${VARIANT}"
+  ${CPU_AFFINITY_APP}java ${JAVA_OPTS} -Xms3G -Xmx3G -jar ${INSECURE_BANK} &> ${LOGS_DIR}/${VARIANT}/insecure-bank.log &PID=$!
+  echo "${CPU_AFFINITY_APP}java ${JAVA_OPTS} -Xms3G -Xmx3G -jar ${INSECURE_BANK} &> ${LOGS_DIR}/${VARIANT}/insecure-bank.log [PID=$PID]"
+}
+
+start_server "no_agent" "-Dserver.port=8080" "taskset -c 47 " &
+start_server "tracing" "-javaagent:${TRACER} -Dserver.port=8081" "taskset -c 46 " &
+start_server "profiling" "-javaagent:${TRACER} -Ddd.profiling.enabled=true -Dserver.port=8082" "taskset -c 45 " &
+start_server "iast" "-javaagent:${TRACER} -Ddd.iast.enabled=true -Dserver.port=8083" "taskset -c 44 " &
+start_server "iast_GLOBAL" "-javaagent:${TRACER} -Ddd.iast.enabled=true -Ddd.iast.context.mode=GLOBAL -Dserver.port=8084" "taskset -c 43 " &
+start_server "iast_FULL" "-javaagent:${TRACER} -Ddd.iast.enabled=true -Ddd.iast.detection.mode=FULL -Dserver.port=8085" "taskset -c 42 " &
+
+wait
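
Reviewer sketch (not part of this PR): `start_server` prepends a `taskset` prefix only when `CI_JOB_TOKEN` is set, so local runs start the JVM unpinned. The same decision could be isolated in a small helper, shown below as an illustration only. The name `affinity_prefix` is hypothetical and does not exist in the repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper, for illustration only: prints a "taskset -c <cpus> "
# prefix when CI_JOB_TOKEN is set (the same check start_server uses) and
# prints nothing otherwise, so the command runs unpinned outside CI.
affinity_prefix() {
  local cpus=$1
  if [ -n "${CI_JOB_TOKEN:-}" ]; then
    printf 'taskset -c %s ' "${cpus}"
  fi
}

# Example: show the prefix that would be applied in the current environment.
echo "prefix: '$(affinity_prefix 47)'"
```

Using `${CI_JOB_TOKEN:-}` keeps the check safe if the script is ever run with `set -u` again.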
46 changes: 0 additions & 46 deletions benchmark/load/petclinic/benchmark.json

This file was deleted.

58 changes: 51 additions & 7 deletions benchmark/load/petclinic/k6.js
@@ -1,17 +1,61 @@
 import http from 'k6/http';
 import {checkResponse, isOk} from "../../utils/k6.js";

-const baseUrl = 'http://localhost:8080';
+const variants = {
+  "no_agent": {
+    "APP_URL": 'http://localhost:8080',
+  },
+  "tracing": {
+    "APP_URL": 'http://localhost:8081',
+  },
+  "profiling": {
+    "APP_URL": 'http://localhost:8082',
+  },
+  "appsec": {
+    "APP_URL": 'http://localhost:8083',
+  },
+  "iast": {
+    "APP_URL": 'http://localhost:8084',
+  },
+  "code_origins": {
+    "APP_URL": 'http://localhost:8085',
+  }
+}
+
+export const options = function (variants) {
+  const scenarios = {};
+  for (const variant of Object.keys(variants)) {
+    scenarios[`load--petclinic--${variant}--warmup`] = {
+      executor: 'constant-vus', // https://grafana.com/docs/k6/latest/using-k6/scenarios/executors/#all-executors
+      vus: 5,
+      duration: '20s',
+      gracefulStop: '2s',
+      env: {
+        "APP_URL": variants[variant]["APP_URL"]
+      }
+    };
+
+    scenarios[`load--petclinic--${variant}--high_load`] = {
+      executor: 'constant-vus',
+      vus: 5,
+      startTime: '22s',
+      duration: '15s',
+      gracefulStop: '2s',
+      env: {
+        "APP_URL": variants[variant]["APP_URL"]
+      }
+    };
+  }

-export const options = {
-  discardResponseBodies: true,
-  vus: 5,
-  iterations: 80000
-};
+  return {
+    discardResponseBodies: true,
+    scenarios,
+  }
+}(variants);

 export default function () {

   // find owner
-  const ownersList = http.get(`${baseUrl}/owners?lastName=`);
+  const ownersList = http.get(`${__ENV.APP_URL}/owners?lastName=`);
   checkResponse(ownersList, isOk);
 }
28 changes: 28 additions & 0 deletions benchmark/load/petclinic/start-servers.sh
@@ -0,0 +1,28 @@
+#!/usr/bin/env bash
+
+set -e
+
+start_server() {
+  local VARIANT=$1
+  local JAVA_OPTS=$2
+
+  if [ -n "$CI_JOB_TOKEN" ]; then
+    # Inside BP, so we can assume 24 CPU cores available and set CPU affinity
+    CPU_AFFINITY_APP=$3
+  else
+    CPU_AFFINITY_APP=""
+  fi
+
+  mkdir -p "${LOGS_DIR}/${VARIANT}"
+  ${CPU_AFFINITY_APP}java ${JAVA_OPTS} -Xms2G -Xmx2G -jar ${PETCLINIC} &> ${LOGS_DIR}/${VARIANT}/petclinic.log &PID=$!
+  echo "${CPU_AFFINITY_APP}java ${JAVA_OPTS} -Xms2G -Xmx2G -jar ${PETCLINIC} &> ${LOGS_DIR}/${VARIANT}/petclinic.log [PID=$!]"
+}
+
+start_server "no_agent" "-Dserver.port=8080" "taskset -c 31-32 " &
+start_server "tracing" "-javaagent:${TRACER} -Dserver.port=8081" "taskset -c 33-34 " &
+start_server "profiling" "-javaagent:${TRACER} -Ddd.profiling.enabled=true -Dserver.port=8082" "taskset -c 35-36 " &
+start_server "appsec" "-javaagent:${TRACER} -Ddd.appsec.enabled=true -Dserver.port=8083" "taskset -c 37-38 " &
+start_server "iast" "-javaagent:${TRACER} -Ddd.iast.enabled=true -Dserver.port=8084" "taskset -c 39-40 " &
+start_server "code_origins" "-javaagent:${TRACER} -Ddd.code.origin.for.spans.enabled=true -Dserver.port=8085" "taskset -c 41-42 " &
+
+wait
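
Reviewer sketch (not part of this PR): the affinity arguments above use the `taskset -c` list syntax, where `31-32` is a range and single cores such as `47` also appear. When checking by hand that two variants do not share a core, it can help to expand a list into individual core numbers. The helper below is a hypothetical illustration, and `expand_cpu_list` is an assumed name. It handles comma-separated entries and simple `start-end` ranges only, not the stride form that `taskset` also accepts.

```shell
#!/usr/bin/env bash

# Hypothetical helper, for illustration only: expands a CPU list such as
# "31-32" or "0,2-3" into space-separated core numbers.
expand_cpu_list() {
  local list=$1 part start end cpu
  local -a cpus=()
  local IFS=','                 # split the list on commas
  for part in ${list}; do
    if [[ ${part} == *-* ]]; then
      start=${part%-*}          # text before the dash
      end=${part#*-}            # text after the dash
      for (( cpu = start; cpu <= end; cpu++ )); do
        cpus+=("${cpu}")
      done
    else
      cpus+=("${part}")
    fi
  done
  IFS=' '                       # join the result with spaces
  echo "${cpus[*]}"
}

# Example: the cores assigned to the petclinic "code_origins" variant.
expand_cpu_list "41-42"
```

Comparing the expanded lists makes it easy to see which cores the two apps' variants have in common.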
76 changes: 74 additions & 2 deletions benchmark/load/run.sh
@@ -1,5 +1,77 @@
 #!/usr/bin/env bash
-set -eu

+set -e
+
+function message() {
+  echo "$(date +"%T"): $1"
+}
+
+function healthcheck() {
+  local url=$1
+
+  while true; do
+    if [[ $(curl -fso /dev/null -w "%{http_code}" "${url}") = 200 ]]; then
+      break
+    fi
+  done
+}
+
+type=$1
+
+if [ -n "$CI_JOB_TOKEN" ]; then
+  # Inside BP, so we can assume 24 CPU cores on the second socket available and set CPU affinity
+  export CPU_AFFINITY_K6="taskset -c 24-27 "
+else
+  export CPU_AFFINITY_K6=""
+fi
+
 source "${UTILS_DIR}/update-java-version.sh" 17
-"${UTILS_DIR}/run-sirun-benchmarks.sh" "$@"
+
+for app in *; do
+  if [[ ! -d "${app}" ]]; then
+    continue
+  fi
+
+  message "${type} benchmark: ${app} started"
+
+  export OUTPUT_DIR="${REPORTS_DIR}/${type}/${app}"
+  mkdir -p ${OUTPUT_DIR}
+
+  export LOGS_DIR="${ARTIFACTS_DIR}/${type}/${app}"
+  mkdir -p ${LOGS_DIR}
+
+  # Using profiler variants for healthcheck as they are the slowest
+  if [ "${app}" == "petclinic" ]; then
+    HEALTHCHECK_URL=http://localhost:8082
+    REPETITIONS_COUNT=5
+  elif [ "${app}" == "insecure-bank" ]; then
+    HEALTHCHECK_URL=http://localhost:8082/login
+    REPETITIONS_COUNT=2
+  else
+    echo "Unknown app ${app}"
+    exit 1
+  fi
+
+  for i in $(seq 1 $REPETITIONS_COUNT); do
+    bash -c "${UTILS_DIR}/../${type}/${app}/start-servers.sh" &
+
+    echo "Waiting for serves to start..."
+    if [ "${app}" == "petclinic" ]; then
+      for port in $(seq 8080 8085); do
+        healthcheck http://localhost:$port
+      done
+    elif [ "${app}" == "insecure-bank" ]; then
+      for port in $(seq 8080 8085); do
+        healthcheck http://localhost:$port/login
+      done
+    fi
+    echo "Servers are up!"
+
+    (
+      cd ${app} &&
+        bash -c "${CPU_AFFINITY_K6}${UTILS_DIR}/run-k6-load-test.sh 'pkill java'"
+    )
+  done
+
+  message "${type} benchmark: ${app} finished"
+done
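
Reviewer sketch (not part of this PR): the `healthcheck` function above polls in a tight loop with no pause and no upper bound, so a server that never comes up would hang the job. One possible variation is a generic retry helper with a polling interval and a deadline. The names `wait_until` and `is_http_200` are hypothetical. The `curl` probe is the same one the PR uses.

```shell
#!/usr/bin/env bash

# Hypothetical helper, for illustration only: runs a command repeatedly until
# it succeeds, sleeping between attempts, and gives up after timeout_s seconds.
# Returns 0 on success and 1 on timeout.
wait_until() {
  local timeout_s=$1 interval_s=$2
  shift 2
  local deadline=$(( SECONDS + timeout_s ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep "${interval_s}"
  done
}

# Same probe as the PR's healthcheck: succeeds only on an HTTP 200 response.
is_http_200() {
  [[ $(curl -fso /dev/null -w "%{http_code}" "$1") = 200 ]]
}

# Usage (commented out so the sketch has no side effects):
# wait_until 120 1 is_http_200 "http://localhost:8080" || echo "server did not start"
```

The timeout and interval values in the usage line are placeholders, not tuned numbers from the benchmark runs.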