
Add benchmark overview doc #1528

Open
cijothomas wants to merge 30 commits into open-telemetry:main from cijothomas:cijothomas/benchoverviewdoc1

Conversation

@cijothomas (Member):

This doc is an attempt at the schema for our Phase 2 performance summary, to be published when Phase 2 is completed. It defines the key scenarios (Idle, 100k Load, Saturation) and the comparative analysis with OTLP/Collector. I've put TBD for the actual numbers, as this is just an attempt to finalize what we want to have in an easy-to-consume format. The actual numbers will be filled in later. This can also be used to see if there are gaps in the perf test suites that we want to add.

The existing pages like https://open-telemetry.github.io/otel-arrow/benchmarks/nightly/backpressure/ are still retained. This doc will have distilled information from them.


codecov bot commented Dec 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.71%. Comparing base (cd545fa) to head (d87fddc).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1528      +/-   ##
==========================================
- Coverage   87.73%   87.71%   -0.02%     
==========================================
  Files         578      578              
  Lines      198334   198413      +79     
==========================================
+ Hits       174012   174044      +32     
- Misses      23796    23843      +47     
  Partials      526      526              
| Component | Coverage | Δ |
| --- | --- | --- |
| otap-dataflow | 89.74% <ø> | (-0.03%) ⬇️ |
| query_abstraction | 80.61% <ø> | (ø) |
| query_engine | 90.61% <ø> | (ø) |
| syslog_cef_receivers | ∅ <ø> | (∅) |
| otel-arrow-go | 52.44% <ø> | (ø) |
| quiver | 91.91% <ø> | (ø) |

@lquerel (Contributor) left a comment:


I really like this document.

I think we should also include the OTLP to OTLP scenario in the different sections since it will be one of the most common scenarios, at least in the beginning.

I also think we should add the wait_for_result mode in the otel-arrow section, because it provides a true end-to-end unified ack/nack mechanism, which I believe is not fully supported by the Go collector.

github-merge-queue bot pushed a commit that referenced this pull request Jan 14, 2026
Fixed one TODO!

#1528 - Still working
on this separately, which will include actual numbers for key scenarios,
so readers don't have to go through the graphs themselves!
@cijothomas cijothomas marked this pull request as ready for review January 17, 2026 00:54
@cijothomas cijothomas requested a review from a team as a code owner January 17, 2026 00:54

github-actions bot commented Feb 4, 2026

This pull request has been marked as stale due to lack of recent activity. It will be closed in 30 days if no further activity occurs. If this PR is still relevant, please comment or push new commits to keep it active.

@github-actions github-actions bot added the stale Not actively pursued label Feb 4, 2026
@jmacd jmacd removed the stale Not actively pursued label Feb 4, 2026
@github-actions

This pull request has been marked as stale due to lack of recent activity. It will be closed in 30 days if no further activity occurs. If this PR is still relevant, please comment or push new commits to keep it active.

@github-actions github-actions bot added the stale Not actively pursued label Feb 26, 2026
@jmacd jmacd removed the stale Not actively pursued label Mar 5, 2026
performance characteristics and efficient resource utilization across varying
load conditions. The engine uses a [thread-per-core
architecture](#thread-per-core-design) where resource consumption scales with
the number of configured cores.
Member:

> resource consumption scales with the number of configured cores

I found this a bit hard to interpret. Do you mean "the throughput scales with the number of configured CPU cores, almost in a linear fashion?"

Member Author:

ya it reads weird.. I'll update with a better wording
(I meant throughput scales linearly, but so does memory consumption)

All performance tests are executed on bare-metal compute instance with the
following specifications:

- **CPU**: 64 physical cores / 128 logical cores (x86-64 architecture)
Member:

Consider calling out the number of NUMA groups?

Member Author:

ya. We just confirmed that the CNCF machine has 2 sockets:
https://github.com/open-telemetry/otel-arrow/actions/runs/23278418373#summary-67686308469

So far no tests were run with the engine on more than 32 cores (and those were all in the same NUMA node). I have to see if we can actually do that, given that load-gen and fake-backend also need cores to run on.


### Test Environment

All performance tests are executed on bare-metal compute instance with the
Member:

Suggested change:
- All performance tests are executed on bare-metal compute instance with the
+ All performance tests are executed on a dedicated bare-metal compute instance with the

Member:

Not sure if this is a shared resource or dedicated, consider calling it out.

Member Author:

it is dedicated. Will update

Comment on lines +55 to +57
*Note: CPU usage is normalized (percentage of total system capacity). Memory
usage scales with core count due to the [thread-per-core
architecture](#thread-per-core-design).*
Member:

"Memory usage" could be confusing (cached, shared, non-paged pool, virtual vs. physical memory, ...); consider aligning with, and pointing to, the OTel system metrics semantic conventions: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/system/system-metrics.md
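As an aside on the normalization mentioned in the quoted note, "percentage of total system capacity" can be sketched as below. This is a minimal illustration, not from the doc: the function name and the sample inputs (CPU-seconds consumed over a measurement window on the 128-logical-core test host) are hypothetical.

```python
def normalized_cpu_percent(cpu_seconds: float, wall_seconds: float, logical_cores: int) -> float:
    """CPU usage as a percentage of total system capacity.

    cpu_seconds: CPU time consumed by the process during the window.
    wall_seconds: length of the measurement window.
    logical_cores: total logical cores on the host (128 on the test machine).
    """
    return 100.0 * cpu_seconds / (wall_seconds * logical_cores)

# A process that burned 64 CPU-seconds over a 10-second window on a
# 128-logical-core host used 5% of total system capacity.
print(normalized_cpu_percent(64.0, 10.0, 128))  # 5.0
```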

This represents the optimal scenario where the dataflow engine operates with its
native protocol end-to-end, eliminating protocol conversion overhead.

##### Standard Load - OTLP -> OTLP (Standard Protocol)
Member:

Which OTLP? (gRPC, proto via HTTP 1.1, JSON, TLS enabled vs. not)

engine and the OpenTelemetry Collector, we use **Syslog (UDP/TCP)** as the
ingress protocol for both systems.

#### Rationale for Syslog-Based Comparison
Member:

Suggested change:
- #### Rationale for Syslog-Based Comparison
+ #### Rationale for Syslog-based Comparison


Scaling Efficiency = (Throughput at N cores) / (N * Single-core throughput)

### Architecture
Member:

It is a bit weird to have an architecture section in the benchmark document (unless it is talking about the benchmarking environment's own architecture).
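The scaling-efficiency formula quoted in this thread is straightforward to compute; a quick sketch follows. The throughput figures used here are made-up placeholders, since the doc's actual numbers are still TBD.

```python
def scaling_efficiency(throughput_n_cores: float, n_cores: int, single_core_throughput: float) -> float:
    """Scaling Efficiency = (Throughput at N cores) / (N * Single-core throughput).

    1.0 means perfectly linear scaling; lower values indicate per-core
    overhead as cores are added.
    """
    return throughput_n_cores / (n_cores * single_core_throughput)

# Hypothetical example: 100k items/s on one core, 750k items/s on eight
# cores -> 93.75% scaling efficiency.
print(scaling_efficiency(750_000, 8, 100_000))  # 0.9375
```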
