-
Notifications
You must be signed in to change notification settings - Fork 25.8k
Fix bare time series aggs server error #139612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bare time series aggs server error #139612
Conversation
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
|
Hi @leontyevdv, I've created a changelog YAML for you. |
|
This code hasn't been released, so not marking this as |
Fix the algorithm of the doc collection. The wrong parameter has been passed to the shardRefCounted() method.
bpintea
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe could you please add a comment why the _timeseries attribute shouldn't be verified? AFAI can see at the sites where a FieldAttribute is instantiated, they do have a common data type.
| var groupings = agg.groupings(); | ||
| groupings.forEach(unresolvedExpressions); | ||
|
|
||
| // We don't count _timeseries which is added implicitly to grouping but not to a list of aggs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't followed the "group by all" development, so not sure why we treat this field specially, in that regard.
Also, TSID is a Metadata attribute so this one feels like should be one too?
But if these decisions have been taken already and there's a reason to go with them, fine by me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Bogdan! Thanks for the review. You can check the group-by-all PR here: github.com//pull/138595
The point i s that we add the _timeseries field in runtime in https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/TimeSeriesGroupByAll.java#L79 into to the grouping collection only (and not into the aggregates collection).
Later in https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/physical/local/ExtractDimensionFieldsAfterAggregation.java#L111 we replace it by the Source field with a custom block loader which uses the synthetic source functionality.
This has been done in order to make the behavior closer to PromQL. So currently, only for bare aggregation functions (e.g. STATS rate(value)) we have two columns in the output: rate and timeseries.
The timeseries output looks like this:
rate:double | _timeseries:keyword | tbucket:datetime
1.6554700854700857 | "{""cluster"":""staging"",""pod"":""one"",""region"":""us""}" | 2024-05-10T00:00:00.000Z
The aggregation itself is being done by _tsid which is a hash from all the dimensions (thus group by all).
In the code above, we check the _timeseries grouping in the groupings.forEach(unresolvedExpressions); line 181. The problem occurs when we subtract groupings.size() from aggregations.size(). Normally, the attribute presents in both - groupings and aggregations. The _timeseries presents in the groupings only.
We also store the name of the field in the MetadataAttribute but it's not a metadata one, but a dimension.
| boolean hasGroupByAll = groupings.stream() | ||
| .filter(g -> g instanceof FieldAttribute) | ||
| .anyMatch(f -> ((FieldAttribute) f).name().equals(MetadataAttribute.TIMESERIES)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see we do check a few Attributes in a similar way (by their names, against this constant) -- I think we should probably either model this attribute in a special name, or at least introduce some helper small method that does this check (potentially like isScoreAttribute.
Myself I think I'd use simple iterations over the groupings, rather than streams, since typically there's at most a handful of them, the cardinality is very low.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I extracted a similar to isScoreAttribute method and use it everywhere.
bpintea
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining to me how this attribute is involved in neither the grouping of data, nor aggregating over the groups, but there just to pipe it through and make it to the output.
Starting from that already implemented need, I think this makes now sense.
(Regarding that need in particular, however, we achieve a similar thing with INLINE STATS, though that is too big of a gun for this requirement, as it performs an actual join. But we might want to look at potentially other modeling ways of an Aggregate sublcass? for it, rather then twisting how Aggregate works.)
|
Backport to 9.3 too? |
Fix the algorithm of the doc collection. The wrong parameter has been passed to the shardRefCounted() method. (cherry picked from commit 10129f2)
💚 All backports created successfully
Questions ?Please refer to the Backport tool documentation |
The issue can be found here: #139424
A rate query with an unsupported data type did not throw the correct exception. The rate query supports the data types "counter_long", "counter_integer" and "counter_double". When calling it with "gauge", the verifier skipped verification due to an incorrect assumption about the number of groupings. This has happened because we add a
_timeseriesgrouping for the bare agg functions ([GROUP BY ALL] #136253), but not for the aggregations.