Skip to content

Support geohash, geotile and geohex grid types #129581

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

craigtaverner
Copy link
Contributor

@craigtaverner craigtaverner commented Jun 17, 2025

The PR at #125143 added support for ST_GEOHASH, ST_GEOTILE and ST_GEOHEX. However, since it used long as the internal type for the grid id, there was need for many additional functions for converting the long to and from keyword as well as generating geo_shape cell bounds for map display. Each pf these involved many more files (for the functions, their docs and the generated evaluators). With inspiration from PostGIS we decided to take a different direction, and instead use a new internal type for each grid:

  • geohash for the ST_GEOHASH function, created from literal using either TO_GEOHASH(hash) or hash::geohash
  • geotile for the ST_GEOTILE function, created from literal using either TO_GEOTILE(tile) or tile::geotile
  • geohex for the ST_GEOHEX function, created from literal using either TO_GEOHEX(h3) or h3::geohex

This also leads to much stricter type checking as we can no longer use the long as a plain long in all functions that accept longs, or inadvertently using a geohash in a geotile function. However, the addition of new types involves a lot of boilerplate, especially considering the large number of functions that operate on all types, and need to be informed of the existence of these three new types.

One of the main goals of this work was to also support the concept of a geogrid search. This means "find all documents that intersect a grid cell", similar to the Query DSL version. This is achieved by enabling the use of these three new types in the ST_INTERSECTS function, so a query like this would satisfy the requirement:

FROM airports
| WHERE ST_INTERSECTS(location, "3/4/3"::geotile)

@craigtaverner craigtaverner added >enhancement :Analytics/Geo Indexing, search aggregations of geo points and shapes Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.1.0 labels Jun 17, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @craigtaverner, I've created a changelog YAML for you.

long geoGridId = getGridId(point, gridId, gridType);
return gridId == geoGridId;
} else {
throw new IllegalArgumentException(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be easy to implement though. It is ok to do it in a follow up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More over, I think we should do it generic, no point in distinguish a point from another geometry, we have code that that this already.

Copy link
Contributor

@iverase iverase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like a lot this approach as a grid cell cannot be really represented by a geometry. I am +1 to this approach as far as ESQL folks are good with it.

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to carefully consider how to deal with the breaking change that this proposes, because we remove some functions altogether.

Additionally, the new functions added by this aren't snapshot-only, either, and will be made available immediately. It's better to have them bake a little before un-snapshotting them in a dedicated PR.

Finally, this adds multiple data types and wires them immediately - I much prefer we do that in a dedicated PR, this is a big change and can easily introduce bugs.

Comment on lines -406 to -413
def(StGeohashToLong.class, StGeohashToLong::new, "st_geohash_to_long"),
def(StGeohashToString.class, StGeohashToString::new, "st_geohash_to_string"),
def(StGeotile.class, StGeotile::new, "st_geotile"),
def(StGeotileToLong.class, StGeotileToLong::new, "st_geotile_to_long"),
def(StGeotileToString.class, StGeotileToString::new, "st_geotile_to_string"),
def(StGeohex.class, StGeohex::new, "st_geohex"),
def(StGeohexToLong.class, StGeohexToLong::new, "st_geohex_to_long"),
def(StGeohexToString.class, StGeohexToString::new, "st_geohex_to_string") },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is breaking. These functions were not snapshot-only (would have had to be placed in the snapshotFunctions below for that).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL :Analytics/Geo Indexing, search aggregations of geo points and shapes >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants