[TT-1741] performance comparison tool #1424

Merged: 64 commits merged into main on Dec 19, 2024

Commits (64)
0337c29
preliminary version of comparison tool
Tofel Dec 2, 2024
d58b313
k8s resources reading + latest commit finding
Tofel Dec 3, 2024
3466334
fix commit lookup
Tofel Dec 3, 2024
605856c
split report into reusable components vol 1
Tofel Dec 3, 2024
a21de6f
make the tool more abstract
Tofel Dec 3, 2024
17531fd
correct propagation of context, no mutex -> use errgroup instead, one…
Tofel Dec 3, 2024
d574af5
add logger (unused yet)
Tofel Dec 3, 2024
c56cbd7
compose Reporter interface of smaller ones
Tofel Dec 3, 2024
d90596d
fix interfaces
Tofel Dec 3, 2024
f6e07ed
add type, start/end time to segments, constructor for BasicData, get …
Tofel Dec 4, 2024
fa8c711
add standard set of metrics/queries and a new test for them
Tofel Dec 4, 2024
a6bbbd3
add plain-segment-only notes/comments
Tofel Dec 4, 2024
072556d
remove most of resource reporter, what we want is actual cpu/mem usag…
Tofel Dec 4, 2024
7dd3cfa
add benchspy unit tests
Tofel Dec 4, 2024
a037c1c
execute benchspy tests in ci
Tofel Dec 4, 2024
dcdd2e0
fix test coverage check
Tofel Dec 4, 2024
68297e6
Merge remote-tracking branch 'origin/main' into tt-1741-performance-c…
Tofel Dec 6, 2024
36f4bd3
use median instead of average as a standard metric
Tofel Dec 6, 2024
c55d236
add generator unit tests
Tofel Dec 6, 2024
301819b
Merge remote-tracking branch 'origin' into tt-1741-performance-compar…
Tofel Dec 9, 2024
605096a
fix independence of generator tests
Tofel Dec 9, 2024
aa2a4eb
Merge branch 'main' into tt-1741-performance-comparison-tool
Tofel Dec 9, 2024
08bbb92
add Prometheus support
Tofel Dec 10, 2024
6d63f5f
remove ResourceFetcher, now Prometheus is just another QueryExecutor
Tofel Dec 10, 2024
7e27cca
fix existing tests
Tofel Dec 10, 2024
d2b61b7
add unit tests for prometheus
Tofel Dec 11, 2024
c2cf545
fix remaining unit tests
Tofel Dec 11, 2024
c2236dd
add helper methods for fetching current and previous report
Tofel Dec 11, 2024
305088e
more unit tests
Tofel Dec 11, 2024
570ebaa
add working test examples, some docs, small code changes
Tofel Dec 12, 2024
827b282
fix lints
Tofel Dec 13, 2024
a189b41
more docs
Tofel Dec 13, 2024
af2a330
one more doc
Tofel Dec 13, 2024
358108a
rename Generator to Direct
Tofel Dec 13, 2024
5a1c3e7
smoother docs
Tofel Dec 13, 2024
73df456
fix median calculation for missing data in examples
Tofel Dec 13, 2024
b847867
add explanation why p95 of direct and loki might not be the same
Tofel Dec 13, 2024
2c56a03
[Bot] Add automatically generated go documentation (#1474)
app-token-issuer-test-toolings[bot] Dec 13, 2024
7531863
update troubleshooting
skudasov Dec 9, 2024
fcc4c27
more docs
skudasov Dec 9, 2024
14a344e
Remove logs from flakeguard all test results (#1453)
lukaszcl Dec 9, 2024
54cc588
[TT-1725] go doc enhancements vol 2 (add tools, better comment in PR)…
Tofel Dec 10, 2024
f041164
Add metadata to Flakeguard report (#1473)
lukaszcl Dec 11, 2024
2884289
Separate JD database (#1472)
skudasov Dec 11, 2024
ba555a6
Fix url and add node container internal ip (#1477)
b-gopalswami Dec 12, 2024
93c902a
Flakeguard: improve report aggregation performance and omit output fo…
lukaszcl Dec 12, 2024
a3f2385
tiny adjustments to Seth docs (#1482)
Tofel Dec 16, 2024
554bb51
Keep test outputs for Flakeguard in separate fields (#1485)
lukaszcl Dec 16, 2024
d66b740
move back loki client test to lib/client
Tofel Dec 16, 2024
22a0417
do not use custom percentile function
Tofel Dec 17, 2024
9cd0fd7
cr changes, more tests, improved docs, real world example
Tofel Dec 18, 2024
d389c00
print table with Direct metrics
Tofel Dec 18, 2024
87d67d9
update docs, divide loki/direct results when casting per generator
Tofel Dec 18, 2024
f5e1c5a
Merge branch 'main' into tt-1741-performance-comparison-tool
Tofel Dec 18, 2024
72e7370
update reports doc, include WASP fix
Tofel Dec 18, 2024
987fee9
[Bot] Add automatically generated go documentation (#1486)
app-token-issuer-test-toolings[bot] Dec 18, 2024
e4d075d
Merge branch 'main' into tt-1741-performance-comparison-tool
Tofel Dec 18, 2024
a8e7ae4
lower cyclomatic complexity
Tofel Dec 18, 2024
befd51c
remove cover.html
Tofel Dec 18, 2024
70bd176
gitignore cover.html
Tofel Dec 18, 2024
cb02f63
Eliminate VU races, unify execution loop, remove cpu check loop (#1505)
skudasov Dec 19, 2024
2822f7c
use newer go doc generator
Tofel Dec 19, 2024
d1e6563
Merge branch 'main' into tt-1741-performance-comparison-tool
Tofel Dec 19, 2024
4edd870
ignore x/net vulnerability
Tofel Dec 19, 2024
2 changes: 1 addition & 1 deletion .github/workflows/generate-go-docs.yaml
@@ -31,7 +31,7 @@
GOPRIVATE: github.com/smartcontractkit/generate-go-function-docs
run: |
git config --global url."https://x-access-token:${{ steps.setup-github-token-read.outputs.access-token }}@github.com/".insteadOf "https://github.com/"
go install github.com/smartcontractkit/[email protected].1
go install github.com/smartcontractkit/[email protected].2
go install github.com/jmank88/[email protected]
go install golang.org/x/tools/gopls@latest

@@ -111,7 +111,7 @@
shell: bash
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_DOC_GEN_API_KEY }}
run: |
  # Add go binary to PATH
  PATH=$PATH:$(go env GOPATH)/bin
  export PATH

[actionlint] Check failure on line 114 in .github/workflows/generate-go-docs.yaml: shellcheck reported SC2002 (style): Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..' instead.
27 changes: 27 additions & 0 deletions .github/workflows/wasp-test-benchspy.yml
@@ -0,0 +1,27 @@
name: WASP's BenchSpy Go Tests
on: [push]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
test:
defaults:
run:
working-directory: wasp
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: dorny/paths-filter@v3
id: changes
with:
filters: |
src:
- 'wasp/benchspy/**'
- uses: cachix/install-nix-action@08dcb3a5e62fa31e2da3d490afc4176ef55ecd72 # v30
if: steps.changes.outputs.src == 'true'
with:
nix_path: nixpkgs=channel:nixos-unstable
- name: Run tests
if: steps.changes.outputs.src == 'true'
run: |-
nix develop -c make test_benchspy_race
2 changes: 1 addition & 1 deletion .github/workflows/wasp-test.yml
@@ -8,7 +8,7 @@
defaults:
run:
working-directory: wasp
-runs-on: ubuntu-latest
+runs-on: ubuntu22.04-16cores-64GB

[actionlint] Check failure on line 11 in .github/workflows/wasp-test.yml: label "ubuntu22.04-16cores-64GB" is unknown; available labels include "ubuntu-latest", "ubuntu-latest-16-cores", "ubuntu-22.04", and the other standard GitHub-hosted runner labels. If it is a custom label for a self-hosted runner, set the list of labels in the actionlint.yaml config file [runner-label].
steps:
- uses: actions/checkout@v3
- uses: dorny/paths-filter@v3
1 change: 1 addition & 0 deletions .gitignore
@@ -19,6 +19,7 @@ artifacts/

# Output of the go coverage tool, specifically when used with LiteIDE
*.out
cover.html

# Dependency directories (remove the comment below to include it)
# vendor/
3 changes: 2 additions & 1 deletion .nancy-ignore
@@ -12,4 +12,5 @@ CVE-2024-24786 # CWE-835 Loop with Unreachable Exit Condition ('Infinite Loop')
CVE-2024-32972 # CWE-400: Uncontrolled Resource Consumption ('Resource Exhaustion') [still not fixed, not even in v1.13.8]
CVE-2023-42319 # CWE-noinfo: lol... go-ethereum v1.13.8 again
CVE-2024-10086 # Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')
CVE-2024-51744 # CWE-755: Improper Handling of Exceptional Conditions
CVE-2024-45338 # CWE-770: Allocation of Resources Without Limits or Throttling
17 changes: 17 additions & 0 deletions book/src/SUMMARY.md
@@ -69,6 +69,23 @@
- [Profile](./libs/wasp/components/profile.md)
- [Sampler](./libs/wasp/components/sampler.md)
- [Schedule](./libs/wasp/components/schedule.md)
- [BenchSpy](./libs/wasp/benchspy/overview.md)
- [Getting started](./libs/wasp/benchspy/getting_started.md)
- [Your first test](./libs/wasp/benchspy/first_test.md)
- [Simplest metrics](./libs/wasp/benchspy/simplest_metrics.md)
- [Standard Loki metrics](./libs/wasp/benchspy/loki_std.md)
- [Custom Loki metrics](./libs/wasp/benchspy/loki_custom.md)
- [Standard Prometheus metrics](./libs/wasp/benchspy/prometheus_std.md)
- [Custom Prometheus metrics](./libs/wasp/benchspy/prometheus_custom.md)
- [To Loki or not to Loki?](./libs/wasp/benchspy/loki_dillema.md)
- [Real world example](./libs/wasp/benchspy/real_world.md)
- [Reports](./libs/wasp/benchspy/reports/overview.md)
- [Standard Report](./libs/wasp/benchspy/reports/standard_report.md)
- [Adding new QueryExecutor](./libs/wasp/benchspy/reports/new_executor.md)
- [Adding new standard load metric]()
- [Adding new standard resource metric]()
- [Defining a new report](./libs/wasp/benchspy/reports/new_report.md)
- [Adding new storage]()
- [How to](./libs/wasp/how-to/overview.md)
- [Start local observability stack](./libs/wasp/how-to/start_local_observability_stack.md)
- [Try it out quickly](./libs/wasp/how-to/run_included_tests.md)
114 changes: 114 additions & 0 deletions book/src/libs/wasp/benchspy/first_test.md
@@ -0,0 +1,114 @@
# BenchSpy - Your First Test

Let's start with the simplest case, which doesn't require any part of the observability stack—only `WASP` and the application you are testing.
`BenchSpy` comes with built-in `QueryExecutors`, each of which also has predefined metrics that you can use. One of these executors is the `DirectQueryExecutor`, which fetches metrics directly from `WASP` generators,
which means you can run it without Loki.

> [!NOTE]
> Not sure whether to use `Loki` or `Direct` query executors? [Read this!](./loki_dillema.md)

## Test Overview

Our first test will follow this logic:
- Run a simple load test.
- Generate a performance report and store it.
- Run the load test again.
- Generate a new report and compare it to the previous one.

We'll use very simplified assertions for this example and expect the performance to remain unchanged.

### Step 1: Define and Run a Generator

Let's start by defining and running a generator that uses a mocked service:

```go
gen, err := wasp.NewGenerator(&wasp.Config{
T: t,
GenName: "vu",
CallTimeout: 100 * time.Millisecond,
LoadType: wasp.VU,
Schedule: wasp.Plain(10, 15*time.Second),
VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
CallSleep: 50 * time.Millisecond,
}),
})
require.NoError(t, err)
gen.Run(true) // run the load and wait for it to finish
```

### Step 2: Generate a Baseline Performance Report

With load data available, let's generate a baseline performance report and store it in local storage:

```go
baseLineReport, err := benchspy.NewStandardReport(
    // this can be any string, but ideally it should be the version tag or commit hash of the Application Under Test (AUT)
"v1.0.0",
// use built-in queries for an executor that fetches data directly from the WASP generator
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
// WASP generators
benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")

fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch data for baseline report")

path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store baseline report", path)
```

> [!NOTE]
> There's a lot to unpack here; you're encouraged to read more about the built-in `QueryExecutors`, the standard metrics they provide, and the `StandardReport` [here](./reports/standard_report.md).
>
> For now, it's enough to know that the standard metrics provided by `StandardQueryExecutor_Direct` include:
> - Median latency
> - P95 latency (95th percentile)
> - Max latency
> - Error rate

> **Review discussion**, reviewer: "Any reason it's not P99, e.g. we consider p99 too noisy? It will show 'worst cases'."
> @skudasov (Dec 17, 2024): "I'd say let's better add max latency to get extreme outliers. 95th and 99th are usually product requirements but you can rely on one or another and discuss it with stakeholders. MAX on the other hand shows you extreme outliers and if you have a strict SLA that, for example, '0 transactions are delivered after 2 minutes' you'll detect it with MAX and can miss with even 99th."

### Step 3: Run the Test Again and Compare Reports

With the baseline report ready, let's run the load test again. This time, we'll use a wrapper function to automatically load the previous report, generate a new one, and ensure they are comparable.

```go
// define a new generator using the same config values
newGen, err := wasp.NewGenerator(&wasp.Config{
T: t,
GenName: "vu",
CallTimeout: 100 * time.Millisecond,
LoadType: wasp.VU,
Schedule: wasp.Plain(10, 15*time.Second),
VU: wasp.NewMockVU(&wasp.MockVirtualUserConfig{
CallSleep: 50 * time.Millisecond,
}),
})
require.NoError(t, err)

// run the load
newGen.Run(true)

fetchCtx, cancelFn = context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

// currentReport is the newly generated report for this run; previousReport is the latest stored one (here, the baseLineReport)
currentReport, previousReport, err := benchspy.FetchNewStandardReportAndLoadLatestPrevious(
fetchCtx,
// commit or tag of the new application version
"v2.0.0",
benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
benchspy.WithGenerators(newGen),
)
require.NoError(t, err, "failed to fetch current report or load the previous one")
```

> [!NOTE]
> In a real-world case, once you've generated the first report, you should only need to use the `benchspy.FetchNewStandardReportAndLoadLatestPrevious` function.
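
To make "very simplified assertions" concrete, here is a minimal, self-contained helper. It is not part of the BenchSpy API (it assumes `math`, `testing`, and `testify/require` imports) and simply fails the test when a metric drifts from its baseline by more than a given fraction; extracting the actual metric values from the two reports is covered in the next chapter.

```go
// assertWithinTolerance fails the test when current deviates from previous
// by more than maxDiff (a fraction, e.g. 0.01 == 1%).
func assertWithinTolerance(t *testing.T, name string, current, previous, maxDiff float64) {
	t.Helper()
	if previous == 0 {
		// a zero baseline only matches a zero current value
		require.Equalf(t, previous, current, "%s: baseline is zero, so the current value must be too", name)
		return
	}
	diff := math.Abs(current-previous) / math.Abs(previous)
	require.LessOrEqualf(t, diff, maxDiff,
		"%s changed by %.2f%% (allowed: %.2f%%)", name, diff*100, maxDiff*100)
}
```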

### What's Next?

Now that we have two reports, how do we ensure that the application's performance meets expectations?
Find out in the [next chapter](./simplest_metrics.md).
14 changes: 14 additions & 0 deletions book/src/libs/wasp/benchspy/getting_started.md
@@ -0,0 +1,14 @@
# BenchSpy - Getting Started

The following examples assume you have access to the following applications:
- Grafana
- Loki
- Prometheus

> [!NOTE]
> The easiest way to run these locally is by using CTFv2's [observability stack](../../../framework/observability/observability_stack.md).
> Be sure to install the `CTF CLI` first, as described in the [CTFv2 Getting Started](../../../framework/getting_started.md) guide.

Since BenchSpy is tightly coupled with WASP, we highly recommend that you [get familiar with it first](../overview.md) if you haven't already.

Ready? [Let's get started!](./first_test.md)
47 changes: 47 additions & 0 deletions book/src/libs/wasp/benchspy/loki_custom.md
@@ -0,0 +1,47 @@
# BenchSpy - Custom Loki Metrics

In this chapter, we’ll explore how to use custom `LogQL` queries in the performance report. For this more advanced use case, we’ll manually compose the performance report.

The load generation part is the same as in the standard Loki metrics example and will be skipped.

## Defining Custom Metrics

Let’s define two illustrative metrics:
- **`vu_over_time`**: The rate of virtual users generated by WASP, using a 10-second window.
- **`responses_over_time`**: The number of responses from the AUT, using a 1-second window.

```go
lokiQueryExecutor := benchspy.NewLokiQueryExecutor(
map[string]string{
"vu_over_time": fmt.Sprintf("max_over_time({branch=~\"%s\", commit=~\"%s\", go_test_name=~\"%s\", test_data_type=~\"stats\", gen_name=~\"%s\"} | json | unwrap current_instances [10s]) by (node_id, go_test_name, gen_name)", label, label, t.Name(), gen.Cfg.GenName),
"responses_over_time": fmt.Sprintf("sum(count_over_time({branch=~\"%s\", commit=~\"%s\", go_test_name=~\"%s\", test_data_type=~\"responses\", gen_name=~\"%s\"} [1s])) by (node_id, go_test_name, gen_name)", label, label, t.Name(), gen.Cfg.GenName),
},
gen.Cfg.LokiConfig,
)
```

> [!NOTE]
> These `LogQL` queries use the standard labels that `WASP` applies when sending data to Loki.

## Creating a `StandardReport` with Custom Queries

Now, let’s create a `StandardReport` using our custom queries:

```go
baseLineReport, err := benchspy.NewStandardReport(
"v1.0.0",
// notice the different functional option used to pass Loki executor with custom queries
benchspy.WithQueryExecutors(lokiQueryExecutor),
benchspy.WithGenerators(gen),
)
require.NoError(t, err, "failed to create baseline report")
```

## Wrapping Up

The rest of the code remains unchanged, except for the names of the metrics being asserted; the full example is linked in the note below.
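
For completeness, that unchanged remainder is sketched below; it reuses the same `FetchData` and `Store` calls introduced in [your first test](./first_test.md):

```go
fetchCtx, cancelFn := context.WithTimeout(context.Background(), 60*time.Second)
defer cancelFn()

fetchErr := baseLineReport.FetchData(fetchCtx)
require.NoError(t, fetchErr, "failed to fetch data for baseline report")

path, storeErr := baseLineReport.Store()
require.NoError(t, storeErr, "failed to store baseline report", path)
```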

Now it’s time to look at the last of the bundled `QueryExecutors`. Proceed to the [next chapter to read about Prometheus](./prometheus_std.md).

> [!NOTE]
> You can find the full example [here](https://github.com/smartcontractkit/chainlink-testing-framework/tree/main/wasp/examples/benchspy/loki_query_executor/loki_query_executor_test.go).
> **Review discussion**, reviewer: "404: https://github.com/smartcontractkit/chainlink-testing-framework/tree/main/wasp/examples/benchspy/loki_query_executor/loki_query_executor_test.go, probably because it's added here but you point to main, so it will work after this PR is merged?"
> Author (@Tofel): "Exactly, it will work only after this PR has been merged (this way I don't have to update it later)."
39 changes: 39 additions & 0 deletions book/src/libs/wasp/benchspy/loki_dillema.md
@@ -0,0 +1,39 @@
# BenchSpy - To Loki or Not to Loki?

You might be wondering whether to use the `Loki` or `Direct` query executor if all you need are basic latency metrics.

## Rule of Thumb

You should opt for the `Direct` query executor if all you need is a single number, such as the median latency or error rate, and you're not interested in:
- Comparing time series directly,
- Examining minimum or maximum values over time, or
- Performing advanced calculations on raw data.

## Why Choose `Direct`?

The `Direct` executor returns a single value for each standard metric using the same raw data that Loki would use. It accesses data stored in the `WASP` generator, which is later pushed to Loki.

This means you can:
- Run your load test without a Loki instance.
- Avoid calculating metrics like the median, 95th percentile latency, or error ratio yourself.

By using `Direct`, you save resources and simplify the process when advanced analysis isn't required.
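
In practice, the choice is a one-line difference when building a `StandardReport`. The sketch below assumes a `StandardQueryExecutor_Loki` constant exists alongside the `StandardQueryExecutor_Direct` one used earlier, and that `gen` was configured with a `LokiConfig`:

```go
// Direct: standard metrics are computed from data held in the generator
// itself, so no Loki instance is required.
directReport, err := benchspy.NewStandardReport(
	"v1.0.0",
	benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Direct),
	benchspy.WithGenerators(gen),
)
require.NoError(t, err)

// Loki: the same standard metrics are instead computed by running LogQL
// queries against your Loki instance.
lokiReport, err := benchspy.NewStandardReport(
	"v1.0.0",
	benchspy.WithStandardQueries(benchspy.StandardQueryExecutor_Loki),
	benchspy.WithGenerators(gen),
)
require.NoError(t, err)
```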

> [!WARNING]
> Metrics calculated by the two query executors may differ slightly due to differences in their data processing and calculation methods:
> - **`Direct` QueryExecutor**: This method processes all individual data points from the raw dataset, ensuring that every value is taken into account for calculations like averages, percentiles, or other statistics. It provides the most granular and precise results but may also be more sensitive to outliers and noise in the data.
> - **`Loki` QueryExecutor**: This method aggregates data using a default window size of 10 seconds. Within each window, multiple raw data points are combined (e.g., through averaging, summing, or other aggregation functions), which reduces the granularity of the dataset. While this approach can improve performance and reduce noise, it also smooths the data, which may obscure outliers or small-scale variability.

> **Review discussion**, @skudasov: "It is worth mentioning that in Direct mode some metrics can be omitted because of SliceBuffer. You need to set up buffer capacity and have enough memory on your runner, that's another limitation."

> #### Why This Matters for Percentiles:
> Percentiles, such as the 95th percentile (p95), are particularly sensitive to the granularity of the input data:
> - In the **`Direct` QueryExecutor**, the p95 is calculated across all raw data points, capturing the true variability of the dataset, including any extreme values or spikes.
> - In the **`Loki` QueryExecutor**, the p95 is calculated over aggregated data (i.e. using the 10-second window). As a result, the raw values within each window are smoothed into a single representative value, potentially lowering or altering the calculated p95. For example, an outlier that would significantly affect the p95 in the `Direct` calculation might be averaged out in the `Loki` window, leading to a slightly lower percentile value.

> #### Direct caveats:
> - **buffer limitations:** `WASP` generators use a fixed-size [StringBuffer](https://github.com/smartcontractkit/chainlink-testing-framework/blob/main/wasp/buffer.go) to store responses. Once full capacity is reached,
> the oldest entries are replaced with incoming ones. The buffer size can be set in the generator's config. By default, it is limited to 50k entries to reduce resource consumption and the risk of OOMs.
>
> - **sampling:** `WASP` generators support optional sampling of successful responses. It is disabled by default, but if you enable it, the calculations are no longer done over the full dataset.

> #### Key Takeaway:
> The difference arises because `Direct` prioritizes precision by using raw data, while `Loki` prioritizes efficiency and scalability by using aggregated data. When interpreting results, it’s essential to consider how the smoothing effect of `Loki` might impact the representation of variability or extremes in the dataset. This is especially important for metrics like percentiles, where such details can significantly influence the outcome.