ST_KNN
Introduction: A join operation that finds the k nearest neighbors of a point or region in a spatial dataset.
Format: ST_KNN(R: Table, S: Table, k: Integer, use_sphere: Boolean, search_radius: Double)
- R represents the queries side table.
- S represents the objects side table.
- k denotes the number of nearest neighbors to retrieve.
- use_sphere is a boolean value that specifies whether to calculate distances using the sphere model.
- search_radius is an optional parameter that defines the maximum distance within which neighbors will be searched. There are no constraints on its value.
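To illustrate what the use_sphere flag changes, the sketch below compares a planar (Euclidean) distance with a great-circle distance on a sphere. It is only an illustration: the haversine formula and the Earth radius constant are assumptions chosen for the example, and this page does not state which spherical formula WherobotsDB uses internally.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (assumed for this sketch)

def planar_distance(a, b):
    """Euclidean distance, expressed in the coordinates' own units."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def haversine_distance(a, b):
    """Great-circle distance in meters between (lon, lat) points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))

p = (0.0, 60.0)
east = (1.0, 60.0)   # one degree of longitude away, at 60 degrees north
north = (0.0, 61.0)  # one degree of latitude away

# Planar: both neighbors are exactly 1.0 degree away.
planar_east, planar_north = planar_distance(p, east), planar_distance(p, north)
# Sphere: the eastern neighbor is roughly half as far as the northern one.
sphere_east, sphere_north = haversine_distance(p, east), haversine_distance(p, north)
```

With longitude/latitude data, the two models can therefore rank the same candidates differently, which is why the choice of distance model matters for a nearest-neighbor join.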
An ST_KNN join behaves as an inner join on the query side table (Table R). It returns only pairs in which the object is one of the k nearest neighbors of the query. If a query point has no valid neighbor (e.g., because k is too large), it is excluded from the result.
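The inner-join behavior can be sketched with a brute-force kNN join in plain Python. This is a minimal illustration of the semantics described above, not the WherobotsDB implementation; the function name knn_join and the sample points are hypothetical.

```python
import math

def knn_join(queries, objects, k):
    """Brute-force kNN join returning (query_id, object_id, distance) rows.

    queries and objects map an id to an (x, y) point. A query with no
    candidate objects contributes no rows, mirroring an inner join.
    """
    rows = []
    for qid, q in queries.items():
        # Rank every object by planar distance to this query point.
        ranked = sorted((math.dist(q, o), oid) for oid, o in objects.items())
        for dist, oid in ranked[:k]:
            rows.append((qid, oid, dist))
    return rows

queries = {1: (0.0, 0.0), 2: (10.0, 10.0)}
objects = {"a": (1.0, 0.0), "b": (0.0, 2.0), "c": (9.0, 10.0)}

result = knn_join(queries, objects, k=2)   # two rows per query
empty = knn_join(queries, {}, k=2)         # no objects: no rows at all
```

Each query contributes at most k rows, and a query without any neighbor simply does not appear in the output.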
Known limitations
The ST_KNN join is a new syntax we introduced to Spatial SQL, so it has a few known limitations. We are actively working on solving them.
- Filter Pushdown Considerations
- Handling SQL-Defined Tables in ST_KNN Joins
In ST_KNN joins, WherobotsDB may attempt to optimize the query in a way that bypasses the intended kNN join logic. Specifically, this can happen if you create DataFrames with hard-coded SQL.
SQL Example
Suppose we have two tables, QUERIES and OBJECTS, with the following data:
QUERIES table:
FALSE indicates that the spheroid distance model is not used. Instead, a planar distance model is applied.
Since the search_radius parameter is not provided, there are no constraints on the maximum distance between the query and the objects.
Output:
search_radius Parameter
In this example, search_radius = 3.0. With this parameter, only neighbors within 3.0 units of each query geometry are considered.
The search_radius acts as a filter, removing any object geometries that are farther than the specified radius.
When search_radius is applied, the query may return fewer than k neighbors if there are not enough points within the radius.
Output:
For the query with QUERY_ID = 2, fewer than 4 neighbors are returned because not enough points are within the specified radius.
The search_radius ensures more focused results by excluding distant neighbors, which can be useful for scenarios requiring spatial proximity.
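The effect of search_radius can be sketched with a brute-force join in plain Python. This is an illustration of the filtering behavior described in this section, not the WherobotsDB implementation. The function name, the sample points, and the choice to treat the radius as inclusive (distance <= search_radius) are all assumptions made for the example.

```python
import math

def knn_join_with_radius(queries, objects, k, search_radius=None):
    """Brute-force kNN join with an optional maximum distance.

    Returns (query_id, object_id, distance) rows. Objects farther than
    search_radius are dropped before the k nearest are taken, so a query
    can yield fewer than k rows, or none at all.
    """
    rows = []
    for qid, q in queries.items():
        candidates = sorted((math.dist(q, o), oid) for oid, o in objects.items())
        if search_radius is not None:
            # Assumption: the boundary is inclusive.
            candidates = [(d, oid) for d, oid in candidates if d <= search_radius]
        rows.extend((qid, oid, d) for d, oid in candidates[:k])
    return rows

# Hypothetical data, not the tables from the example above.
queries = {1: (0.0, 0.0), 2: (10.0, 0.0), 3: (50.0, 50.0)}
objects = {"a": (1.0, 0.0), "b": (2.0, 0.0), "c": (0.0, 2.5),
           "d": (0.0, 1.0), "e": (9.0, 0.0)}

rows = knn_join_with_radius(queries, objects, k=4, search_radius=3.0)
counts = {qid: sum(1 for r in rows if r[0] == qid) for qid in queries}
unbounded = knn_join_with_radius(queries, objects, k=4)
```

In this sketch, query 1 keeps all 4 neighbors, query 2 keeps only the single object within 3.0 units, and query 3 drops out of the result entirely because nothing lies within the radius. Without a radius, every query returns 4 rows.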
ST_AKNN
Introduction: A join operation that finds the k nearest neighbors of a point or region in a spatial dataset.
Format: ST_AKNN(R: Table, S: Table, k: Integer, use_sphere: Boolean, search_radius: Double)
The ST_AKNN function is similar to ST_KNN, but it uses approximate algorithms to find the k-nearest neighbors. This can be useful for large datasets where exact kNN search is computationally expensive. The trade-off is that approximate algorithms may not always return the exact k-nearest neighbors, but they provide a good approximation in a reasonable amount of time.
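The accuracy trade-off can be illustrated with a toy comparison between an exact search and a simple approximate one. This page does not describe the algorithm behind ST_AKNN, so the grid-based candidate pruning below is a stand-in chosen only to show how an approximate search can miss true neighbors. All function names and data are hypothetical.

```python
import math
from collections import defaultdict

def exact_knn(q, points, k):
    """Exact kNN: rank every point by distance to q."""
    return sorted(points, key=lambda p: math.dist(q, p))[:k]

def build_grid(points, cell):
    """Bucket points by the grid cell that contains them."""
    grid = defaultdict(list)
    for p in points:
        grid[(math.floor(p[0] / cell), math.floor(p[1] / cell))].append(p)
    return grid

def approx_knn(q, grid, cell, k):
    """Approximate kNN: rank only points in q's cell and the 8 cells around it."""
    cx, cy = math.floor(q[0] / cell), math.floor(q[1] / cell)
    candidates = [
        p
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for p in grid.get((cx + dx, cy + dy), [])
    ]
    return sorted(candidates, key=lambda p: math.dist(q, p))[:k]

points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (5.5, 5.5)]
grid = build_grid(points, cell=1.0)
q = (0.4, 0.4)

exact = exact_knn(q, points, k=4)
approx = approx_knn(q, grid, cell=1.0, k=4)
# Recall: the fraction of true neighbors that the approximate search found.
recall = len(set(exact) & set(approx)) / len(exact)
```

Here the approximate search inspects far fewer candidates but never sees the distant point at (5.5, 5.5), so it recovers 3 of the 4 true neighbors. Measuring recall against an exact result on a sample is one way to judge whether an approximate join is accurate enough for a given workload.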
Known limitations
Similar to ST_KNN, the ST_AKNN join is a new syntax we introduced to Spatial SQL.
- Filter Pushdown Considerations
- Handling SQL-Defined Tables in ST_AKNN Joins

