Skip to content

Conversation

@leontyevdv
Copy link
Contributor

@leontyevdv leontyevdv commented Dec 16, 2025

The issue can be found here: #139424

A rate query with an unsupported data type did not throw the correct exception. The rate query supports the data types "counter_long", "counter_integer" and "counter_double". When calling it with "gauge", the verifier skipped verification due to an incorrect assumption about the number of groupings. This has happened because we add a _timeseries grouping for the bare agg functions ([GROUP BY ALL] #136253), but not for the aggregations.

@leontyevdv leontyevdv requested review from a team and jan-elastic December 16, 2025 15:05
@leontyevdv leontyevdv self-assigned this Dec 16, 2025
@leontyevdv leontyevdv added >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL Team:StorageEngine :StorageEngine/ES|QL Timeseries / metrics / logsdb capabilities in ES|QL v9.3.0 labels Dec 16, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Copy link
Collaborator

Hi @leontyevdv, I've created a changelog YAML for you.

@leontyevdv leontyevdv requested a review from bpintea December 16, 2025 15:06
@kkrik-es kkrik-es added >non-issue and removed >bug labels Dec 16, 2025
@kkrik-es
Copy link
Contributor

This code hasn't been released, so not marking this as bug.

Fix the algorithm of the doc collection. The wrong parameter has been
passed to the shardRefCounted() method.
Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe could you please add a comment why the _timeseries attribute shouldn't be verified? AFAI can see at the sites where a FieldAttribute is instantiated, they do have a common data type.

var groupings = agg.groupings();
groupings.forEach(unresolvedExpressions);

// We don't count _timeseries which is added implicitly to grouping but not to a list of aggs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't followed the "group by all" development, so not sure why we treat this field specially, in that regard.
Also, TSID is a Metadata attribute so this one feels like should be one too?

But if these decisions have been taken already and there's a reason to go with them, fine by me.

Copy link
Contributor Author

@leontyevdv leontyevdv Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Bogdan! Thanks for the review. You can check the group-by-all PR here: github.com//pull/138595

The point i s that we add the _timeseries field in runtime in https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/TimeSeriesGroupByAll.java#L79 into to the grouping collection only (and not into the aggregates collection).

Later in https://github.com/elastic/elasticsearch/blob/main/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/physical/local/ExtractDimensionFieldsAfterAggregation.java#L111 we replace it by the Source field with a custom block loader which uses the synthetic source functionality.

This has been done in order to make the behavior closer to PromQL. So currently, only for bare aggregation functions (e.g. STATS rate(value)) we have two columns in the output: rate and timeseries.

The timeseries output looks like this:

rate:double        | _timeseries:keyword                                   | tbucket:datetime
1.6554700854700857 | "{""cluster"":""staging"",""pod"":""one"",""region"":""us""}"       | 2024-05-10T00:00:00.000Z

The aggregation itself is being done by _tsid which is a hash from all the dimensions (thus group by all).

In the code above, we check the _timeseries grouping in the groupings.forEach(unresolvedExpressions); line 181. The problem occurs when we subtract groupings.size() from aggregations.size(). Normally, the attribute presents in both - groupings and aggregations. The _timeseries presents in the groupings only.

We also store the name of the field in the MetadataAttribute but it's not a metadata one, but a dimension.

Comment on lines 184 to 186
boolean hasGroupByAll = groupings.stream()
.filter(g -> g instanceof FieldAttribute)
.anyMatch(f -> ((FieldAttribute) f).name().equals(MetadataAttribute.TIMESERIES));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we do check a few Attributes in a similar way (by their names, against this constant) -- I think we should probably either model this attribute in a special name, or at least introduce some helper small method that does this check (potentially like isScoreAttribute.

Myself I think I'd use simple iterations over the groupings, rather than streams, since typically there's at most a handful of them, the cardinality is very low.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I extracted a similar to isScoreAttribute method and use it everywhere.

@leontyevdv leontyevdv requested a review from bpintea December 17, 2025 12:50
Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining to me how this attribute is involved in neither the grouping of data, nor aggregating over the groups, but there just to pipe it through and make it to the output.
Starting from that already implemented need, I think this makes now sense.

(Regarding that need in particular, however, we achieve a similar thing with INLINE STATS, though that is too big of a gun for this requirement, as it performs an actual join. But we might want to look at potentially other modeling ways of an Aggregate sublcass? for it, rather then twisting how Aggregate works.)

@leontyevdv leontyevdv merged commit 10129f2 into elastic:main Dec 18, 2025
35 checks passed
@kkrik-es
Copy link
Contributor

Backport to 9.3 too?

leontyevdv added a commit to leontyevdv/elasticsearch that referenced this pull request Dec 18, 2025
Fix the algorithm of the doc collection. The wrong parameter has been
passed to the shardRefCounted() method.

(cherry picked from commit 10129f2)
@leontyevdv
Copy link
Contributor Author

💚 All backports created successfully

Status Branch Result
9.3

Questions ?

Please refer to the Backport tool documentation

elasticsearchmachine pushed a commit that referenced this pull request Dec 18, 2025
Fix the algorithm of the doc collection. The wrong parameter has been
passed to the shardRefCounted() method.

(cherry picked from commit 10129f2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >non-issue :StorageEngine/ES|QL Timeseries / metrics / logsdb capabilities in ES|QL Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:StorageEngine v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants