-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Generalise exponential bucket histogram, add percentile calculation #127597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generalise exponential bucket histogram, add percentile calculation #127597
Conversation
Closes: ES-10531
…ool_queue_latency_metric
…ool_queue_latency_metric
…ool_queue_latency_metric
…ool_queue_latency_metric
…ialBucketHistogram.java Co-authored-by: Yang Wang <[email protected]>
…ool_queue_latency_metric
…ialBucketHistogram.java Co-authored-by: Dianna Hohensee <[email protected]>
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
* @return An array of frequencies of handling times in buckets with upper bounds as returned by {@link #getBucketUpperBounds()}, plus | ||
* an extra bucket for handling times longer than the longest upper bound. | ||
*/ | ||
public long[] getSnapshot() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this from getHistogram
to getSnapshot
because histogram.getHistogram()
would look weird.
server/src/main/java/org/elasticsearch/common/metrics/ExponentialBucketHistogram.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For completeness it might be good to add a few more test cases noted below, otherwise LGTM.
if (percentile < 0 || percentile > 1) { | ||
throw new IllegalArgumentException("Requested percentile must be in [0, 1.0], percentile=" + percentile); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to add a few quick expectThrows()
tests for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 221e079
public void clear() { | ||
for (int i = 0; i < BUCKET_COUNT; i++) { | ||
buckets[i].reset(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And a test for clear()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 221e079
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm, thanks for splitting this out.
if (percentile < 0 || percentile > 1) { | ||
throw new IllegalArgumentException("Requested percentile must be in [0, 1.0], percentile=" + percentile); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
public void clear() { | ||
for (int i = 0; i < BUCKET_COUNT; i++) { | ||
buckets[i].reset(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
|
||
/** | ||
* Tracks how long message handling takes on a transport thread as a histogram with fixed buckets. | ||
*/ | ||
public class HandlingTimeTracker { | ||
public class HandlingTimeTracker extends ExponentialBucketHistogram { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to keep the HandlingTimeTracker
because the decision to have 17 + 1 buckets is contextual and should be fixed because it's used in serialization. It also reduces the size of the change.
HandlingTimeTracker
is now just an ExponentialBucketHistogram
of a specific size.
public ExponentialBucketHistogram(int bucketCount) { | ||
if (bucketCount < 2 || bucketCount > Integer.SIZE) { | ||
throw new IllegalArgumentException("Bucket count must be in [2, " + Integer.SIZE + "], got " + bucketCount); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The observations are long
but the bucket bounds are integers, I suppose because the scale of the numbers we've been putting in here are easily contained in the integer range. Perhaps we could revisit this if we need histograms to hold larger values, but that's not the case for the queue latency metric.
I want to re-use the exponential-bucket histogram implemented as
HandlingTimeTracker
for our queue latency metric.This PR renames it, and references to it, to make it more suitable for re-use and adds
getPercentile
which can be used to calculate a percentile value from its contents.See #120488
Relates ES-10531
Note there's a massive amount of commits because I just took the other PR and removed the metrics bits. It should all get squashed away anyway.