Draft
Changes from 16 commits
Commits
49 commits
b3d976c
Perf test - try in windows runners
cijothomas Mar 13, 2026
3e40e61
tr
cijothomas Mar 13, 2026
0de8935
dockers
cijothomas Mar 13, 2026
5515bed
newer image
cijothomas Mar 13, 2026
830eb5b
docker can't find image
cijothomas Mar 13, 2026
10be839
python pin
cijothomas Mar 13, 2026
cb4a4e6
lock
cijothomas Mar 13, 2026
cc78f0f
mount
cijothomas Mar 13, 2026
893a504
retry timeout increased
cijothomas Mar 13, 2026
ecc0a48
investigate why docker crash
cijothomas Mar 13, 2026
ddd670c
slink
cijothomas Mar 13, 2026
6a81e0a
run upto 4 cores
cijothomas Mar 13, 2026
309409f
parse errorfix
cijothomas Mar 13, 2026
ec79e08
cleanup validations
cijothomas Mar 13, 2026
21348bb
add 100k rps
cijothomas Mar 13, 2026
3ab878f
no attr in middle
cijothomas Mar 13, 2026
707f0e8
Add batchprocessor to perf tests (#2246)
cijothomas Mar 13, 2026
2230efe
Add resource_attributes support to traffic generator (#2265)
lalitb Mar 13, 2026
ff70ded
fix: SharedReceiver::try_recv maps Empty to Closed causing spurious s…
gouslu Mar 13, 2026
47512bc
AzureMonitor Exporter - nit rename of event attribute to be consisten…
cijothomas Mar 13, 2026
8d41002
[Geneva exporter] Add metrics, and enable telemetry for Geneva export…
lalitb Mar 13, 2026
a93c9c6
chore: Migrate simple processors to core-nodes crate (#2292)
drewrelmas Mar 13, 2026
c769e37
Syslog - add TCP in load tests (#2281)
cijothomas Mar 13, 2026
dc23d4d
fix heartbeat table mappings (#2254)
gouslu Mar 13, 2026
8b15ff4
fix: Add if ${{ !cancelled() }} for batch processor upload (#2320)
JakeDern Mar 13, 2026
8467c4f
chore: Migrate remaining processors to core-nodes crate (#2314)
drewrelmas Mar 13, 2026
6d4ee66
AzMonExporter - simplify auth retry and logging (#2311)
cijothomas Mar 13, 2026
f1b964e
chore(deps): update dependency duckdb to v1.5.0 (#2333)
renovate[bot] Mar 16, 2026
11b1390
chore(deps): update dependency charset-normalizer to v3.4.6 (#2331)
renovate[bot] Mar 16, 2026
dce6686
chore(deps): update azure-sdk-for-rust monorepo to 0.33.0 (#2332)
renovate[bot] Mar 16, 2026
cfd5991
AzMon - add count of network errors (#2324)
cijothomas Mar 16, 2026
932b0a0
chore(deps): update opentelemetry-python monorepo to v1.40.0 (#2337)
renovate[bot] Mar 16, 2026
af0dbcc
azure monitor exporter: optimize transformer with direct JSON seriali…
gouslu Mar 16, 2026
a56e71b
chore(deps): update dependency pydantic-core to v2.42.0 (#2336)
renovate[bot] Mar 16, 2026
d20af79
chore: Migrate topics, internal_telemetry_receiver, and perf_exporter…
drewrelmas Mar 16, 2026
58cd3ce
Optimize bitmap usage in `FilterPipelineStage` (#2329)
albertlockett Mar 16, 2026
90db464
Validation framework testcontainers (#2307)
c1ly Mar 16, 2026
217c5ea
Migrate to new bare metal runner (Ubuntu 24) (#2338)
trask Mar 16, 2026
001e0a1
feat(pdata): implement OtapTracesView for zero-copy Arrow iteration (…
gyanranjanpanda Mar 17, 2026
5fbc2b3
Disable Renovate updates of indirect python lock dependencies (#2351)
drewrelmas Mar 17, 2026
d373b63
fix: Use dedicated channel for engine lifecycle events to prevent sta…
lalitb Mar 17, 2026
08d58b0
chore: Migrate syslog_cef_receiver and parquet_exporter to core-nodes…
drewrelmas Mar 17, 2026
5408545
feat: OTAP Schema construct and complete definitions (#2346)
JakeDern Mar 17, 2026
cf1a0ab
Add configurable payload size for static data source in fake_data_gen…
cijothomas Mar 17, 2026
e54bcb5
[otap-df-telemetry] add duration metric helper for processors (#2211)
jmacd Mar 17, 2026
9d9cc55
use 900 byte size
cijothomas Mar 17, 2026
0a16d33
Merge branch 'main' into cijothomas/perfwin-1
cijothomas Mar 17, 2026
8059e63
2kb
cijothomas Mar 18, 2026
ac2b239
Merge branch 'main' into cijothomas/perfwin-1
cijothomas Mar 18, 2026
138 changes: 138 additions & 0 deletions .github/workflows/pipeline-perf-test-windows.yml
@@ -0,0 +1,138 @@
name: Pipeline Performance Tests - Windows
permissions:
  contents: read

on:
  pull_request:
    paths:
      - '.github/workflows/pipeline-perf-test-windows.yml'
      - 'rust/otap-dataflow/**'
      - 'tools/pipeline_perf_test/**'
  workflow_dispatch:

jobs:
  pipeline-perf-test-windows:
    runs-on: windows-latest
    steps:
      - name: Harden the runner (Audit all outbound calls)
        uses: step-security/harden-runner@a90bcbc6539c36a85cdfeb73f7e2f433735f215b # v2.15.0
        with:
          egress-policy: audit

      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          submodules: true

      - name: Free disk space
        shell: pwsh
        run: |
          Write-Host "disk usage before"
          Get-PSDrive -PSProvider FileSystem | Select-Object Name, @{Name='Used(GB)';Expression={[math]::Round($_.Used/1GB,2)}}, @{Name='Free(GB)';Expression={[math]::Round($_.Free/1GB,2)}}
          if (Test-Path "C:\Android") { Remove-Item -Recurse -Force "C:\Android" }
          if (Test-Path "C:\SeleniumWebDrivers") { Remove-Item -Recurse -Force "C:\SeleniumWebDrivers" }
          if (Test-Path "C:\imagemagick") { Remove-Item -Recurse -Force "C:\imagemagick" }
          Write-Host "disk usage after"
          Get-PSDrive -PSProvider FileSystem | Select-Object Name, @{Name='Used(GB)';Expression={[math]::Round($_.Used/1GB,2)}}, @{Name='Free(GB)';Expression={[math]::Round($_.Free/1GB,2)}}

      - uses: arduino/setup-protoc@c65c819552d16ad3c9b72d9dfd5ba5237b9c906b # v3.0.0
        with:
          repo-token: ${{ secrets.GITHUB_TOKEN }}

      - name: Start Docker service
        shell: pwsh
        run: |
          Start-Service docker
          docker info

      - uses: dtolnay/rust-toolchain@efa25f7f19611383d5b0ccf2d1c8914531636bf9
        with:
          toolchain: stable

      - name: Build df_engine.exe
        shell: pwsh
        working-directory: ./rust/otap-dataflow
        run: cargo build --release --features mimalloc
        env:
          RUSTFLAGS: "-C target-cpu=native -C target-feature=+crt-static"

      - name: Build Windows Docker image
        shell: pwsh
        working-directory: ./rust/otap-dataflow
        run: docker build -f Dockerfile.windows -t df_engine_win .

      - name: Verify Docker image
        shell: pwsh
        run: |
          docker images df_engine_win
          docker run --rm df_engine_win --version

      - name: Install Python dependencies
        if: ${{ !cancelled() }}
        shell: pwsh
        run: |
          # Use plain requirements.txt (not .lock.txt) on Windows because the
          # lock files contain --hash directives that trigger pip hash-checking
          # mode, which rejects the Windows-only transitive dep pywin32 (not in
          # the Linux-generated lock file).
          python -m pip install --user -r tools/pipeline_perf_test/orchestrator/requirements.txt
          python -m pip install --user -r tools/pipeline_perf_test/load_generator/requirements.txt

      - name: Run idle state test (1 core)
        if: ${{ !cancelled() }}
        shell: pwsh
        run: |
          cd tools/pipeline_perf_test
          python orchestrator/run_orchestrator.py --config test_suites/integration/nightly-windows/idle-state-docker.yaml

      - name: Run idle state test (2 cores)
        if: ${{ !cancelled() }}
        shell: pwsh
        run: |
          cd tools/pipeline_perf_test
          python orchestrator/run_orchestrator.py --config test_suites/integration/nightly-windows/idle-state-2cores-docker.yaml

      - name: Run idle state test (4 cores)
        if: ${{ !cancelled() }}
        shell: pwsh
        run: |
          cd tools/pipeline_perf_test
          python orchestrator/run_orchestrator.py --config test_suites/integration/nightly-windows/idle-state-4cores-docker.yaml

      - name: Analyze memory scaling
        if: ${{ !cancelled() }}
        shell: pwsh
        env:
          PYTHONUTF8: "1"
        run: |
          python .github/workflows/scripts/analyze-idle-state-scaling.py `
            tools/pipeline_perf_test/results `
            tools/pipeline_perf_test/results/windows-idle-memory-scaling.json `
            | Tee-Object -Variable scalingReport
          # Add to job summary
          echo "### Windows Idle State Memory Scaling Analysis" >> $env:GITHUB_STEP_SUMMARY
          echo '```' >> $env:GITHUB_STEP_SUMMARY
          $scalingReport >> $env:GITHUB_STEP_SUMMARY
          echo '```' >> $env:GITHUB_STEP_SUMMARY

      - name: Run 100kLRPS OTLP-OTLP test
        if: ${{ !cancelled() }}
        shell: pwsh
        run: |
          cd tools/pipeline_perf_test
          python orchestrator/run_orchestrator.py --config test_suites/integration/nightly-windows/100klrps-docker.yaml

      - name: Upload idle state results
        if: ${{ !cancelled() }}
        uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
        with:
          name: windows-idle-state-results
          path: |
            tools/pipeline_perf_test/results/windows_idle_state_*/gh-actions-benchmark/*.json
            tools/pipeline_perf_test/results/windows-idle-memory-scaling.json

      - name: Upload 100kLRPS results
        if: ${{ !cancelled() }}
        uses: actions/upload-artifact@bbbca2ddaa5d8feaa63e36b76fdaad77386f024f # v7.0.0
        with:
          name: windows-100klrps-results
          path: tools/pipeline_perf_test/results/windows_100klrps/gh-actions-benchmark/*.json
6 changes: 3 additions & 3 deletions .github/workflows/scripts/analyze-idle-state-scaling.py
@@ -262,11 +262,11 @@ def main():
    # Find idle state result directories
    memory_data: dict[int, float] = {}

    # Look for directories matching idle_state_* pattern
    idle_dirs = list(results_base.glob("idle_state_*"))
    # Look for directories matching idle_state_* or windows_idle_state_* pattern
    idle_dirs = list(results_base.glob("idle_state_*")) + list(results_base.glob("windows_idle_state_*"))

    if not idle_dirs:
        print("No idle state test results found (looking for idle_state_* directories)", file=sys.stderr)
        print("No idle state test results found (looking for *idle_state_* directories)", file=sys.stderr)
        sys.exit(0)

    print(f"Found {len(idle_dirs)} idle state result directories", file=sys.stderr)
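The widened directory search in the hunk above simply concatenates two `Path.glob` results. A minimal standalone sketch of that behavior (the directory names and temp layout here are invented for illustration; they are not from the PR):

```python
from pathlib import Path
import tempfile

# Hypothetical results layout: the script now accepts both the Linux-style
# idle_state_* directories and the new windows_idle_state_* directories.
base = Path(tempfile.mkdtemp())
for name in ("idle_state_1core", "windows_idle_state_2cores", "unrelated"):
    (base / name).mkdir()

# Same pattern combination as the diff: glob anchors at the start of the
# name, so "idle_state_*" does not match the "windows_" prefixed dirs.
idle_dirs = list(base.glob("idle_state_*")) + list(base.glob("windows_idle_state_*"))
print(sorted(d.name for d in idle_dirs))
# -> ['idle_state_1core', 'windows_idle_state_2cores']
```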
1 change: 1 addition & 0 deletions rust/otap-dataflow/.dockerignore
@@ -1,4 +1,5 @@
target
!target/release/df_engine.exe
Cargo.lock
docs
configs
23 changes: 23 additions & 0 deletions rust/otap-dataflow/Dockerfile.windows
@@ -0,0 +1,23 @@
# Copyright The OpenTelemetry Authors
# SPDX-License-Identifier: Apache-2.0

# This Dockerfile packages a pre-built df_engine.exe into a Windows container.
# The binary must be built on the host before building this image.
#
# Build steps from the rust/otap-dataflow directory:
# cargo build --release --features mimalloc
# docker build -f Dockerfile.windows -t df_engine_win .
#
# The binary is expected at target/release/df_engine.exe relative to the build
# context (the rust/otap-dataflow directory).
#
# The base image tag must match the Windows version on the host. GitHub Actions
# windows-latest uses Windows Server 2025 which requires ltsc2025.

FROM mcr.microsoft.com/windows/servercore:ltsc2025

WORKDIR C:\\dataflow

COPY target/release/df_engine.exe C:/dataflow/df_engine.exe

ENTRYPOINT ["C:\\dataflow\\df_engine.exe"]
@@ -276,6 +276,35 @@ def stop(self, component: Component, ctx: StepContext):
# Helpers


def _reassemble_drive_letter_parts(parts: List[str]) -> List[str]:
    """Reassemble parts split on ':' that may contain Windows drive letters.

    A Windows drive letter is a single alpha character immediately followed
    (after the split) by a segment starting with '/' or '\\'.

    Examples (after split on ':'):
        ['host', 'C', '/container', 'ro'] -> ['host', 'C:/container', 'ro']
        ['C', '/host', 'C', '/container'] -> ['C:/host', 'C:/container']
        ['host', '/container', 'ro'] -> ['host', '/container', 'ro'] (unchanged)
    """
    result: List[str] = []
    i = 0
    while i < len(parts):
        if (
            len(parts[i]) == 1
            and parts[i].isalpha()
            and i + 1 < len(parts)
            and parts[i + 1][:1] in ("/", "\\")
        ):
            # Drive letter detected - merge with the following path segment.
            result.append(parts[i] + ":" + parts[i + 1])
            i += 2
        else:
            result.append(parts[i])
            i += 1
    return result


def build_volume_bindings(
volume_mounts: Optional[List[Union[str, DockerVolumeMapping]]],
) -> Dict[str, Dict[str, str]]:
@@ -291,7 +320,9 @@ def build_volume_bindings(
    for vm in volume_mounts:
        if isinstance(vm, str):
            # Parse string format: /host:/container[:ro|rw]
            parts = vm.split(":")
            # Also supports Windows drive-letter paths, e.g.
            #   relative/host:C:/container/path:ro
            parts = _reassemble_drive_letter_parts(vm.split(":"))
            if len(parts) < 2 or len(parts) > 3:
                raise ValueError(f"Invalid volume mount string: '{vm}'")

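The drive-letter handling above can be exercised on its own. The following standalone sketch re-implements the helper from the diff (renamed without the leading underscore for use outside the module) and runs the docstring examples:

```python
from typing import List

def reassemble_drive_letter_parts(parts: List[str]) -> List[str]:
    """Merge a single alpha segment with the following path segment,
    treating the pair as a Windows drive-letter path split on ':'
    (e.g. ['C', '/host'] -> ['C:/host'])."""
    result: List[str] = []
    i = 0
    while i < len(parts):
        if (
            len(parts[i]) == 1
            and parts[i].isalpha()
            and i + 1 < len(parts)
            and parts[i + 1][:1] in ("/", "\\")
        ):
            result.append(parts[i] + ":" + parts[i + 1])
            i += 2
        else:
            result.append(parts[i])
            i += 1
    return result

# The three docstring examples from the diff:
print(reassemble_drive_letter_parts(["host", "C", "/container", "ro"]))
# -> ['host', 'C:/container', 'ro']
print(reassemble_drive_letter_parts(["C", "/host", "C", "/container"]))
# -> ['C:/host', 'C:/container']
print(reassemble_drive_letter_parts(["host", "/container", "ro"]))
# -> ['host', '/container', 'ro']
```

Applied after `vm.split(":")`, this lets the existing 2-or-3-part validation in `build_volume_bindings` work unchanged for mounts like `relative/host:C:/container/path:ro`.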
@@ -335,20 +335,32 @@ def monitor(
cpu_stats["cpu_usage"]["total_usage"]
- precpu_stats["cpu_usage"]["total_usage"]
)
system_delta = (
cpu_stats["system_cpu_usage"] - precpu_stats["system_cpu_usage"]
)

cpu_usage = 0.0
if system_delta > 0.0 and cpu_delta > 0.0:
num_cpus = (
len(cpu_stats["cpu_usage"].get("percpu_usage", []))
or cpu_stats["online_cpus"]
# Windows containers do not report system_cpu_usage.
# Use a time-based approximation instead.
if "system_cpu_usage" in cpu_stats:
system_delta = (
cpu_stats["system_cpu_usage"]
- precpu_stats["system_cpu_usage"]
)
cpu_usage = (cpu_delta / system_delta) * num_cpus
if system_delta > 0.0 and cpu_delta > 0.0:
num_cpus = (
len(cpu_stats["cpu_usage"].get("percpu_usage", []))
or cpu_stats.get("online_cpus", 1)
)
cpu_usage = (cpu_delta / system_delta) * num_cpus
else:
# Windows: cpu_delta is in 100-nanosecond units.
# Convert to number-of-cores over the poll interval.
num_cpus = cpu_stats.get("online_cpus", 1)
if cpu_delta > 0 and interval > 0:
# 10_000_000 = 100-ns units per second
cpu_usage = cpu_delta / (interval * 10_000_000)

# Memory usage in Bytes
mem_usage = stat_data["memory_stats"]["usage"]
mem_stats = stat_data.get("memory_stats", {})
mem_usage = mem_stats.get("usage", mem_stats.get("privateworkingset", 0))
cpu_usage_gauge.set(cpu_usage, labels)
memory_usage_gauge.set(mem_usage, labels)

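The Windows branch above hinges on one unit conversion: Windows container stats report `cpu_delta` as CPU time consumed in 100-nanosecond units, and 10,000,000 such units equal one CPU-second, so dividing by the poll interval yields average cores in use. A minimal sketch of just that arithmetic (the function name is hypothetical, not from the PR):

```python
def windows_cpu_cores(cpu_delta_100ns: int, interval_seconds: float) -> float:
    """Approximate average cores used over a poll interval.

    cpu_delta_100ns: CPU time consumed, in 100-nanosecond units
    (as reported by Windows container stats, which lack
    system_cpu_usage). 10_000_000 units == 1 CPU-second.
    """
    if cpu_delta_100ns <= 0 or interval_seconds <= 0:
        return 0.0
    return cpu_delta_100ns / (interval_seconds * 10_000_000)

# A container that consumed 5,000,000 units (0.5 CPU-seconds)
# over a 1-second poll interval averaged half a core:
print(windows_cpu_cores(5_000_000, 1.0))  # -> 0.5
```

Unlike the Linux path, no `num_cpus` scaling is needed here: the result is already expressed in cores, and can exceed 1.0 when multiple cores are busy.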
@@ -0,0 +1,125 @@
# Template for idle state performance test on Windows Docker containers.
# This template is used to generate both single-core and full-cores tests.
# Differences from the Linux template:
# - Uses df_engine_win:latest (Windows container image)
# - Volume mounts the config directory (Windows containers cannot bind-mount files)
name: Continuous - Idle State Performance (Windows) - {{core_label}}
components:
  df-engine:
    deployment:
      docker:
        image: df_engine_win:latest
        network: testbed
        ports:
          - "{{port}}:8080"
        volumes:
          - 'test_suites/integration/configs/engine:C:/dataflow/config:ro'
        command:
          - "--config"
          - "C:/dataflow/config/config.rendered.yaml"
{% if core_range is defined %}
          - "--core-id-range"
          - "{{core_range}}"
{% endif %}
          - "--http-admin-bind"
          - "0.0.0.0:8080"
    monitoring:
{% if allocated_cores is defined %}
      docker_component:
        allocated_cores: {{allocated_cores}}
{% else %}
      docker_component: {}
{% endif %}
      prometheus:
        endpoint: http://localhost:{{port}}/telemetry/metrics?format=prometheus&reset=false

tests:
  - name: Idle State Baseline (Windows) - {{core_label}}
    steps:
      - name: Deploy Dataflow Engine
        action:
          component_action:
            phase: deploy
            target: df-engine
        hooks:
          run:
            pre:
              - render_template:
                  template_path: 'test_suites/integration/templates/configs/engine/continuous/otlp-attr-otlp.yaml'
                  output_path: ./test_suites/integration/configs/engine/config.rendered.yaml
                  variables:
                    backend_hostname: localhost
            post:
              - ready_check_http:
                  url: http://localhost:{{port}}/telemetry/metrics?reset=false
                  method: GET
                  expected_status_code: 200
                  max_retries: 30
                  retry_interval: 2

      - name: Wait for Startup Stabilization
        action:
          wait:
            delay_seconds: 5
        hooks:
          run:
            pre:
              - record_event:
                  name: stabilization_start
            post:
              - record_event:
                  name: stabilization_complete

      - name: Monitor Engine
        action:
          component_action:
            phase: start_monitoring
            target: df-engine

      - name: Observe Idle State
        action:
          wait:
            delay_seconds: 15
        hooks:
          run:
            pre:
              - record_event:
                  name: observation_start
            post:
              - record_event:
                  name: observation_stop

      - name: Stop Monitoring
        action:
          component_action:
            phase: stop_monitoring
            target: df-engine

      - name: Destroy Engine
        action:
          component_action:
            phase: destroy
            target: df-engine

      - name: Run Report
        action:
          wait:
            delay_seconds: 0
        hooks:
          run:
            post:
              - print_container_logs: {}
              - sql_report:
                  name: Idle State Performance Report (Windows) - {{core_label}}
                  report_config_file: ./test_suites/integration/configs/idle_state_report.yaml
                  output:
                    - format:
                        template: {}
                      destination:
                        console: {}
                    - format:
                        template:
                          path: ./test_suites/integration/templates/reports/gh-action-sqlreport.j2
                      destination:
                        file:
                          directory: results/windows_idle_state_{{result_subdir}}/gh-actions-benchmark