Component(s)
receiver/splunkhec
What happened?
Description
I am seeing log truncation when using the OpenTelemetry Collector to forward Jenkins job logs to Splunk. The truncation happens at a specific location: the log is cut off after the "git fetch ..." line, and nothing after that line reaches Splunk.
If I manually append a character to the end of the "git fetch ..." line, the full log is forwarded correctly. If the log is left unchanged, the truncation persists.
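Because appending a single character changes the outcome, it may help to look at the raw bytes around that line in the build log on disk. The following Python sketch is a hypothetical diagnostic helper (the name `bytes_around` and the sample data are assumptions, not part of the collector or Jenkins). It returns the bytes surrounding the first occurrence of a marker so that hidden characters, such as escape sequences or a missing trailing newline, become visible.

```python
def bytes_around(data: bytes, marker: bytes, width: int = 40) -> bytes:
    """Return the bytes surrounding the first occurrence of marker.

    Returns b"" when the marker does not occur in data.
    """
    pos = data.find(marker)
    if pos < 0:
        return b""
    start = max(0, pos - width)
    end = min(len(data), pos + len(marker) + width)
    return data[start:end]


if __name__ == "__main__":
    # Made-up sample; on a real host, read the build log in binary mode,
    # e.g. open(path, "rb").read(), and pass the result as data.
    sample = (
        b"using GIT_ASKPASS to set credentials\n"
        b" > git fetch --tags # timeout=10\n"
        b" > git rev-parse\n"
    )
    # repr() shows control characters such as \n or \x1b explicitly.
    print(repr(bytes_around(sample, b"git fetch", width=10)))
```

Comparing this output for an untouched log and for a log with the extra character appended might show what actually differs between the two files.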
Steps to Reproduce
Set up OpenTelemetry Collector to forward Jenkins job logs to Splunk.
Use the following configuration for the filelog receiver and exporters:
filelog/jenkins:
  include:
    - /var/lib/jenkins/jobs/*/builds/*/log
  include_file_name: false
  include_file_path: true
  operators:
    - from: attributes["log.file.path"]
      to: resource["com.splunk.source"]
      type: move
    - type: add
      field: resource["com.splunk.sourcetype"]
      value: jenkins
    - type: add
      field: resource["com.splunk.index"]
      value: {{ .project_code }}_app
    - type: add
      field: resource["environment"]
      value: xxx
    - type: add
      field: resource["aws_account"]
      value: xxx
    - type: regex_parser
      regex: (?P<message>[\s\S]*)
Observe that the Jenkins log arriving in Splunk is truncated, specifically after the git fetch line.
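As a quick sanity check on the last operator, the sketch below applies the pattern from the full receiver configuration, `(?P<message>[\s\S]*)`, to a multi-line string using Python's `re` module. This is only an approximation: the collector evaluates the expression with Go's regular-expression engine rather than Python's, so the result is indicative, not conclusive. The helper name `capture_message` and the sample text are made up for illustration.

```python
import re

# Same pattern as the regex_parser operator in the filelog/jenkins receiver.
PATTERN = re.compile(r"(?P<message>[\s\S]*)")


def capture_message(entry: str) -> str:
    """Return the text captured by the 'message' group for one log entry."""
    match = PATTERN.match(entry)
    return match.group("message") if match else ""


if __name__ == "__main__":
    # Made-up entry containing embedded newlines.
    entry = " > git fetch --tags # timeout=10\n > git rev-parse origin/main\n"
    captured = capture_message(entry)
    print(len(entry), len(captured))
```

If the two lengths match, the pattern itself captures the whole entry, which would suggest looking elsewhere (for example, at how the file is read) for the cause of the truncation.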
Expected Result
The full Jenkins job log should be forwarded without truncation, especially for long logs with git operations.
Started by timer
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building in workspace /var/lib/jenkins/workspace/DB-CheckDupDNS
The recommended git tool is: NONE
using credential b02d020e-f21a-4c22-a239-85b043c1fe38
 > git rev-parse --resolve-git-dir /var/lib/jenkins/workspace/DB-CheckDupDNS/.git # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://gitlab.xxxxxxx.net/xxxxxxx/database.git # timeout=10
Fetching upstream changes from https://gitlab.xxxxxxx.net/xxxxxxx/database.git
 > git --version # timeout=10
 > git --version # 'git version 2.47.1'
using GIT_ASKPASS to set credentials svc-snt-jenkins
 > git fetch --tags --force --progress -- https://gitlab.xxxxxxx.net/xxxxxxx/database.git +refs/heads/*:refs/remotes/origin/* # timeout=10
 > git rev-parse origin/feds-uplift-master^{commit} # timeout=10
Checking out Revision f69c5b0060e411e59199da3ea710af7e92cb766c (origin/feds-uplift-master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f f69c5b0060e411e59199da3ea710af7e92cb766c # timeout=10
Commit message: "Merge branch 'master' into 'feds-uplift-master'"
 > git rev-list --no-walk f69c5b0060e411e59199da3ea710af7e92cb766c # timeout=10
No emails were triggered.
[EnvInject] - Inject global passwords.
[EnvInject] - Mask passwords that will be passed as build parameters.
[DB-CheckDupDNS] $ /bin/sh -xe /tmp/jenkins14996528112343886716.sh
+ export REQUESTS_CA_BUNDLE=/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
+ REQUESTS_CA_BUNDLE=/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem
++ tail -n 1
++ sort -t. -n -k2,2
++ grep '[0-9]$'
++ ls /usr/bin/python3.11 /usr/bin/python3.9 /usr/bin/python3.9-config /usr/bin/python3.9-x86_64-config
+ PYTHON_EXE=/usr/bin/python3.11
+ /usr/bin/python3.11 /var/lib/jenkins/workspace/DB-CheckDupDNS/DBA/AWS/jenkins/DB-CheckDupDNS/check_dup_dns.py
No emails were triggered.
Finished: SUCCESS
Actual Result
Logs are truncated at a specific location (usually after the git fetch line), and the rest of the log is missing.
Started by timer
Running as SYSTEM
[EnvInject] - Loading node environment variables.
Building in workspace /var/lib/jenkins/workspace/DB-CheckDupDNS
The recommended git tool is: NONE
using credential b02d020e-f21a-4c22-a239-85b043c1fe38
 > git rev-parse --resolve-git-dir /var/lib/jenkins/workspace/DB-CheckDupDNS/.git # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://gitlab.xxxxxxx.net/xxxxxxx/database.git # timeout=10
Fetching upstream changes from https://gitlab.xxxxxxx.net/xxxxxxx/database.git
 > git --version # timeout=10
 > git --version # 'git version 2.47.1'
using GIT_ASKPASS to set credentials svc-snt-jenkins
 > git fetch --tags --force --progress -- https://gitlab.xxxxxxx.net/xxxxxxx/database.git +refs/heads/*:refs/remotes/origin/* # timeout=10
*** Manually adding a character at the end of the "> git fetch" line fixes the issue temporarily.
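To pinpoint where the forwarded copy ends, the source log can be compared with the event text retrieved from Splunk. The following is a minimal Python sketch; `first_missing_line` is a hypothetical helper, and it assumes both texts have already been obtained as strings (for example, the build log read from disk and the event text exported from Splunk).

```python
from typing import Optional, Tuple


def first_missing_line(source: str, forwarded: str) -> Optional[Tuple[int, str]]:
    """Find the first source line that is absent or different in the forwarded copy.

    Returns a (1-based line number, line text) tuple, or None when every
    source line is present in the forwarded copy at the same position.
    """
    src_lines = source.splitlines()
    fwd_lines = forwarded.splitlines()
    for index, line in enumerate(src_lines):
        if index >= len(fwd_lines) or fwd_lines[index] != line:
            return index + 1, line
    return None


if __name__ == "__main__":
    # Made-up, shortened stand-ins for the expected and actual logs.
    expected = "Started by timer\n > git fetch --tags\n > git rev-parse\nFinished: SUCCESS\n"
    actual = "Started by timer\n > git fetch --tags\n"
    print(first_missing_line(expected, actual))
```

Running this against several affected builds could show whether the cut always falls on the same line or merely near it.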
Collector version
otelcol --version
otelcol version v0.112.0
Environment information
Environment
OS: Amazon Linux 2023
Splunk version: 9.3.1
Jenkins version: 2.452.2
OpenTelemetry Collector configuration
# cat exporters.yaml
splunk_hec/logs:
  token: {{ .splunk_hec_token }}
  endpoint: https://splunk-hf.{{ .domain }}:443/services/collector/event
  retry_on_failure:
    enabled: false
    max_elapsed_time: 3600
    max_interval: 300
  tls:
    ca_file: {{ .cert }}
splunk_hec/metrics:
  token: {{ .splunk_hec_token }}
  endpoint: https://splunk-hf.{{ .domain }}:443/services/collector
  source: otel
  sourcetype: em_metrics
  index: {{ .project_code }}_metrics
  retry_on_failure:
    enabled: false
    max_elapsed_time: 3600
    max_interval: 300
  tls:
    ca_file: {{ .cert }}
# cat processors.yaml
batch:
  send_batch_max_size: 10000
  timeout: 5s
memory_limiter:
  check_interval: 1s
  limit_percentage: 50
  spike_limit_percentage: 30
resource:
  attributes:
    - key: host.name
      value: {{ .host_name }}
      action: upsert
metricstransform:
  transforms:
    - include: ^percent\.(.*)$$
      match_type: regexp
      action: update
      new_name: cpu.$1
      operations:
        - action: update_label
          label: plugin_instance
          new_label: cpu
    - include: ^percent\_bytes\.(.*)$$
      match_type: regexp
      action: update
      new_name: df.$1
      operations:
        - action: update_label
          label: plugin_instance
          new_label: device
        - action: add_label
          new_label: type
          new_value: percent_bytes
    - include: ^percent\_inodes\.(.*)$$
      match_type: regexp
      action: update
      new_name: df.$1
      operations:
        - action: update_label
          label: plugin_instance
          new_label: device
        - action: add_label
          new_label: type
          new_value: percent_inodes
    - include: ^df\_complex\.(.*)$$
      match_type: regexp
      action: update
      new_name: df.$1
      operations:
        - action: update_label
          label: plugin_instance
          new_label: device
        - action: add_label
          new_label: type
          new_value: df_complex
    - include: ^df\_inodes\.(.*)$$
      match_type: regexp
      action: update
      new_name: df.$1
      operations:
        - action: update_label
          label: plugin_instance
          new_label: device
        - action: add_label
          new_label: type
          new_value: df_inodes
    - include: ^.*$$
      match_type: regexp
      action: update
      operations:
        - action: add_label
          new_label: entity_type
          new_value: nix_host
        - action: add_label
          new_label: Tenant
          new_value: {{ .project_code }}
        - action: add_label
          new_label: Release
          new_value: FEDS-AmazonLinux-2023-x86_64-v3.49.5029901
        - action: add_label
          new_label: os
          new_value: amazon
        - action: add_label
          new_label: cloud
          new_value: aws
# cat receivers.yaml
journald:
  directory: /run/log/journal
  priority: info
  operators:
    - type: add
      field: resource["com.splunk.index"]
      value: {{ .project_code }}_linux
    - type: add
      field: resource["com.splunk.sourcetype"]
      value: journald
    - type: add
      field: resource["com.splunk.source"]
      value: "/run/log/journal"
filelog/system:
  include:
    - /Library/Logs/*
    - /var/log/*.log*
    - /var/log/*log
    - /var/log/*messages*
    - /var/log/*secure*
    - /var/log/*auth*
    - /var/log/*mesg
    - /var/log/*cron
    - /var/log/*acpid
    - /var/log/*.out*
    - /var/adm/*.log*
    - /var/adm/*log
    - /var/adm/*messages*
    - /etc/*.conf*
    - /etc/*.cfg*
    - /etc/*config
    - /etc/*.ini*
    - /etc/*.init*
    - /etc/*.cf*
    - /etc/*.cnf*
    - /etc/*shrc
    - /etc/ifcfg*
    - /etc/*.profile*
    - /etc/*.rc*
    - /etc/*.rules*
    - /etc/*.tab*
    - /etc/*tab
    - /etc/*.login*
    - /etc/*policy
    # - /root/.bash_history
    # - /home/*/.bash_history
  exclude:
    - /var/log/*lastlog*
    - /var/log/*anaconda.syslog*
  include_file_name: false
  include_file_path: true
  operators:
    - from: attributes["log.file.path"]
      to: resource["com.splunk.source"]
      type: move
    - type: add
      field: resource["com.splunk.sourcetype"]
      value: linux
    - type: add
      field: resource["com.splunk.index"]
      value: {{ .project_code }}_linux
    - type: regex_parser
      regex: (?P<timestamp_field>^\w{3}\s\d{1,2}\s\d{1,2}\:\d{2}\:\d{2})
      timestamp:
        layout: "%b %d %H:%M:%S"
        parse_from: attributes.timestamp_field
      on_error: send_quiet
filelog/rkhunter:
  include:
    - /var/log/rkhunter/rkhunter.log
  include_file_name: false
  include_file_path: true
  operators:
    - from: attributes["log.file.path"]
      to: resource["com.splunk.source"]
      type: move
    - type: add
      field: resource["com.splunk.sourcetype"]
      value: rkhunter
    - type: add
      field: resource["com.splunk.index"]
      value: {{ .project_code }}_linux
filelog/mde:
  include:
    - /var/log/microsoft/mdatp/*.log
  include_file_name: false
  include_file_path: true
  operators:
    - from: attributes["log.file.path"]
      to: resource["com.splunk.source"]
      type: move
    - type: add
      field: resource["com.splunk.sourcetype"]
      value: ms:defender:atp:system_logs
    - type: add
      field: resource["com.splunk.index"]
      value: {{ .project_code }}_clamav
    - type: regex_parser
      regex: \[\d+\]\[(?P<timestamp_field>\d{4}\-\d{2}\-\d+\s\d+\:\d{2}\:\d{2}\.\d+\s\w+)\]
      timestamp:
        layout: "%Y-%m-%d %H:%M:%S %Z"
        parse_from: attributes.timestamp_field
      on_error: send_quiet
collectd:
  endpoint: 'localhost:8081'
hostmetrics:
  collection_interval: 5m
  scrapers:
    process:
filelog/jenkins:
  include:
    - /var/lib/jenkins/jobs/*/builds/*/log
  include_file_name: false
  include_file_path: true
  operators:
    - from: attributes["log.file.path"]
      to: resource["com.splunk.source"]
      type: move
    - type: add
      field: resource["com.splunk.sourcetype"]
      value: jenkins
    - type: add
      field: resource["com.splunk.index"]
      value: {{ .project_code }}_app
    - type: add
      field: resource["environment"]
      value: xxx
    - type: add
      field: resource["aws_account"]
      value: xxx
    - type: regex_parser
      regex: (?P<message>[\s\S]*)
Log output
# cat /var/log/messages | grep otel |grep CheckDupDNS |grep 82058
Mar 24 18:30:00 ip-100-64-9-9 otelcol[2754516]: 2025-03-24T18:30:00.887Z#011info#011fileconsumer/file.go:265#011Started watching file#011{"kind": "receiver", "name": "filelog/jenkins", "data_type": "logs", "component": "fileconsumer", "path": "/var/lib/jenkins/jobs/DB-CheckDupDNS/builds/82058/log"}
Mar 24 18:30:00 ip-100-64-9-9 otelcol[2754516]: 2025-03-24T18:30:00.887Z#011info#011fileconsumer/file.go:265#011Started watching file#011{"kind": "receiver", "name": "filelog/jenkins", "data_type": "logs", "component": "fileconsumer", "path": "/var/lib/jenkins/jobs/DB-CheckDupDNS/builds/82058/log"}
Additional context
No response