> ## Documentation Index
> Fetch the complete documentation index at: https://docs.wherobots.com/llms.txt
> Use this file to discover all available pages before exploring further.

# RDD

This page explains how to customize the internal algorithms used in Spatial SQL geospatial queries. Such customization requires Spatial RDD, which is not recommended by us.

## Convert a Spatial DataFrame to SpatialRDD

Use WherobotsDB DataFrame-RDD Adapter to convert a DataFrame to an SpatialRDD.

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark import *

    spatialRDD = Adapter.toSpatialRdd(spatialDf, "usacounty")
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    var spatialRDD = Adapter.toSpatialRdd(spatialDf, "usacounty")
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    SpatialRDD spatialRDD = Adapter.toSpatialRdd(spatialDf, "usacounty")
    ```
  </Tab>
</Tabs>

"usacounty" is the name of the geometry column

<Warning>
  Only one Geometry type column is allowed per DataFrame.
</Warning>

## Write a Spatial Range Query

A spatial range query takes as input a range query window and an SpatialRDD and returns all geometries that have specified relationship with the query window.

Assume you now have an SpatialRDD (typed or generic). You can use the following code to issue an Spatial Range Query on it.

\==spatialPredicate== can be set to `SpatialPredicate.INTERSECTS` to return all geometries intersect with query window. Supported spatial predicates are:

* `CONTAINS`: geometry is completely inside the query window
* `INTERSECTS`: geometry have at least one point in common with the query window
* `WITHIN`: geometry is completely within the query window (no touching edges)
* `COVERS`: query window has no point outside of the geometry
* `COVERED_BY`: geometry has no point outside of the query window
* `OVERLAPS`: geometry and the query window spatially overlap
* `CROSSES`: geometry and the query window spatially cross
* `TOUCHES`: the only points shared between geometry and the query window are on the boundary of geometry and the query window
* `EQUALS`: geometry and the query window are spatially equal

<Note>
  Spatial range query is equivalent with a SELECT query with spatial predicate as search condition in Spatial SQL. An example query is as follows:
</Note>

```sql theme={"system"}
SELECT *
FROM checkin
WHERE ST_Intersects(checkin.location, queryWindow)
```

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark.core.geom.envelope import Envelope
    from sedona.spark import *

    range_query_window = Envelope(-90.01, -80.01, 30.01, 40.01)
    consider_boundary_intersection = False  ## Only return gemeotries fully covered by the window
    using_index = False
    query_result = RangeQuery.SpatialRangeQuery(spatial_rdd, range_query_window, consider_boundary_intersection, using_index)
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    val rangeQueryWindow = new Envelope(-90.01, -80.01, 30.01, 40.01)
    val spatialPredicate = SpatialPredicate.COVERED_BY // Only return gemeotries fully covered by the window
    val usingIndex = false
    var queryResult = RangeQuery.SpatialRangeQuery(spatialRDD, rangeQueryWindow, spatialPredicate, usingIndex)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    Envelope rangeQueryWindow = new Envelope(-90.01, -80.01, 30.01, 40.01)
    SpatialPredicate spatialPredicate = SpatialPredicate.COVERED_BY // Only return gemeotries fully covered by the window
    boolean usingIndex = false
    JavaRDD queryResult = RangeQuery.SpatialRangeQuery(spatialRDD, rangeQueryWindow, spatialPredicate, usingIndex)
    ```
  </Tab>
</Tabs>

<Note>
  WherobotsDB Python users: Please use RangeQueryRaw from the same module if you want to avoid jvm python serde while converting to Spatial DataFrame. It takes the same parameters as RangeQuery but returns reference to jvm rdd which can be converted to dataframe without python - jvm serde using Adapter.
</Note>

Example:

```python theme={"system"}
from sedona.spark.core.geom.envelope import Envelope
from sedona.spark import *

range_query_window = Envelope(-90.01, -80.01, 30.01, 40.01)
consider_boundary_intersection = False  ## Only return gemeotries fully covered by the window
using_index = False
query_result = RangeQueryRaw.SpatialRangeQuery(spatial_rdd, range_query_window, consider_boundary_intersection, using_index)
gdf = Adapter.toDf(query_result, spark, ["col1", ..., "coln"])
```

### Range query window

Besides the rectangle (Envelope) type range query window, WherobotsDB range query window can be Point/Polygon/LineString.

The code to create a point, linestring (4 vertices) and polygon (4 vertices) is as follows:

<Tabs>
  <Tab title="Python">
    A Shapely geometry can be used as a query window. To create shapely geometries, please follow the [Shapely official docs](https://shapely.readthedocs.io/en/stable/manual.html).
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    val geometryFactory = new GeometryFactory()
    val pointObject = geometryFactory.createPoint(new Coordinate(-84.01, 34.01))

    val geometryFactory = new GeometryFactory()
    val coordinates = new Array[Coordinate](5)
    coordinates(0) = new Coordinate(0,0)
    coordinates(1) = new Coordinate(0,4)
    coordinates(2) = new Coordinate(4,4)
    coordinates(3) = new Coordinate(4,0)
    coordinates(4) = coordinates(0) // The last coordinate is the same as the first coordinate in order to compose a closed ring
    val polygonObject = geometryFactory.createPolygon(coordinates)

    val geometryFactory = new GeometryFactory()
    val coordinates = new Array[Coordinate](4)
    coordinates(0) = new Coordinate(0,0)
    coordinates(1) = new Coordinate(0,4)
    coordinates(2) = new Coordinate(4,4)
    coordinates(3) = new Coordinate(4,0)
    val linestringObject = geometryFactory.createLineString(coordinates)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    GeometryFactory geometryFactory = new GeometryFactory()
    Point pointObject = geometryFactory.createPoint(new Coordinate(-84.01, 34.01))

    GeometryFactory geometryFactory = new GeometryFactory()
    Coordinate[] coordinates = new Array[Coordinate](5)
    coordinates(0) = new Coordinate(0,0)
    coordinates(1) = new Coordinate(0,4)
    coordinates(2) = new Coordinate(4,4)
    coordinates(3) = new Coordinate(4,0)
    coordinates(4) = coordinates(0) // The last coordinate is the same as the first coordinate in order to compose a closed ring
    Polygon polygonObject = geometryFactory.createPolygon(coordinates)

    GeometryFactory geometryFactory = new GeometryFactory()
    val coordinates = new Array[Coordinate](4)
    coordinates(0) = new Coordinate(0,0)
    coordinates(1) = new Coordinate(0,4)
    coordinates(2) = new Coordinate(4,4)
    coordinates(3) = new Coordinate(4,0)
    LineString linestringObject = geometryFactory.createLineString(coordinates)
    ```
  </Tab>
</Tabs>

### Use spatial indexes

WherobotsDB provides two types of spatial indexes, Quad-Tree and R-Tree. Once you specify an index type, WherobotsDB will build a local tree index on each of the SpatialRDD partition.

To utilize a spatial index in a spatial range query, use the following code:

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark.core.geom.envelope import Envelope
    from sedona.spark import *

    range_query_window = Envelope(-90.01, -80.01, 30.01, 40.01)
    consider_boundary_intersection = False ## Only return gemeotries fully covered by the window

    build_on_spatial_partitioned_rdd = False ## Set to TRUE only if run join query
    spatial_rdd.buildIndex(IndexType.QUADTREE, build_on_spatial_partitioned_rdd)

    using_index = True

    query_result = RangeQuery.SpatialRangeQuery(
            spatial_rdd,
            range_query_window,
            consider_boundary_intersection,
            using_index
    )
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    val rangeQueryWindow = new Envelope(-90.01, -80.01, 30.01, 40.01)
    val spatialPredicate = SpatialPredicate.COVERED_BY // Only return gemeotries fully covered by the window

    val buildOnSpatialPartitionedRDD = false // Set to TRUE only if run join query
    spatialRDD.buildIndex(IndexType.QUADTREE, buildOnSpatialPartitionedRDD)

    val usingIndex = true
    var queryResult = RangeQuery.SpatialRangeQuery(spatialRDD, rangeQueryWindow, spatialPredicate, usingIndex)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    Envelope rangeQueryWindow = new Envelope(-90.01, -80.01, 30.01, 40.01)
    SpatialPredicate spatialPredicate = SpatialPredicate.COVERED_BY // Only return gemeotries fully covered by the window

    boolean buildOnSpatialPartitionedRDD = false // Set to TRUE only if run join query
    spatialRDD.buildIndex(IndexType.QUADTREE, buildOnSpatialPartitionedRDD)

    boolean usingIndex = true
    JavaRDD queryResult = RangeQuery.SpatialRangeQuery(spatialRDD, rangeQueryWindow, spatialPredicate, usingIndex)
    ```
  </Tab>
</Tabs>

<Tip>
  Using an index might not be the best choice all the time because building index also takes time. A spatial index is very useful when your data is complex polygons and line strings.
</Tip>

### Output format

<Tabs>
  <Tab title="Python">
    The output format of the spatial range query is another RDD which consists of GeoData objects.

    SpatialRangeQuery result can be used as RDD with map or other spark RDD functions. Also it can be used as
    Python objects when using collect method.
    Example:

    ```python theme={"system"}
    query_result.map(lambda x: x.geom.length).collect()
    ```

    ```
    [
     1.5900840000000045,
     1.5906639999999896,
     1.1110299999999995,
     1.1096700000000084,
     1.1415619999999933,
     1.1386399999999952,
     1.1415619999999933,
     1.1418860000000137,
     1.1392780000000045,
     ...
    ]
    ```

    Or transformed to GeoPandas GeoDataFrame

    ```python theme={"system"}
    import geopandas as gpd
    gpd.GeoDataFrame(
            query_result.map(lambda x: [x.geom, x.userData]).collect(),
            columns=["geom", "user_data"],
            geometry="geom"
    )
    ```
  </Tab>

  <Tab title="Scala/Java">
    The output format of the spatial range query is another SpatialRDD.
  </Tab>
</Tabs>

## Write a Spatial KNN Query

A spatial K Nearnest Neighbor query takes as input a K, a query point and an SpatialRDD and finds the K geometries in the RDD which are the closest to he query point.

Assume you now have an SpatialRDD (typed or generic). You can use the following code to issue an Spatial KNN Query on it.

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark import *
    from shapely.geometry import Point

    point = Point(-84.01, 34.01)
    k = 1000 ## K Nearest Neighbors
    using_index = False
    result = KNNQuery.SpatialKnnQuery(object_rdd, point, k, using_index)
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    val geometryFactory = new GeometryFactory()
    val pointObject = geometryFactory.createPoint(new Coordinate(-84.01, 34.01))
    val K = 1000 // K Nearest Neighbors
    val usingIndex = false
    val result = KNNQuery.SpatialKnnQuery(objectRDD, pointObject, K, usingIndex)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    GeometryFactory geometryFactory = new GeometryFactory()
    Point pointObject = geometryFactory.createPoint(new Coordinate(-84.01, 34.01))
    int K = 1000 // K Nearest Neighbors
    boolean usingIndex = false
    JavaRDD result = KNNQuery.SpatialKnnQuery(objectRDD, pointObject, K, usingIndex)
    ```
  </Tab>
</Tabs>

<Note>
  Spatial KNN query that returns 5 Nearest Neighbors is equal to the following statement in Spatial SQL
</Note>

```sql theme={"system"}
SELECT ck.name, ck.rating, ST_Distance(ck.location, myLocation) AS distance
FROM checkins ck
ORDER BY distance DESC
LIMIT 5
```

### Query center geometry

Besides the Point type, WherobotsDB KNN query center can be Polygon and LineString.

<Tabs>
  <Tab title="Python">
    To create Polygon or Linestring object please follow [Shapely official docs](https://shapely.readthedocs.io/en/stable/manual.html)
  </Tab>

  <Tab title="Scala/Java">
    To learn how to create Polygon and LineString object, see [Range query window](#range-query-window).
  </Tab>
</Tabs>

### Use spatial indexes

To utilize a spatial index in a spatial KNN query, use the following code:

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark import *
    from shapely.geometry import Point

    point = Point(-84.01, 34.01)
    k = 5 ## K Nearest Neighbors

    build_on_spatial_partitioned_rdd = False ## Set to TRUE only if run join query
    spatial_rdd.buildIndex(IndexType.RTREE, build_on_spatial_partitioned_rdd)

    using_index = True
    result = KNNQuery.SpatialKnnQuery(spatial_rdd, point, k, using_index)
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    val geometryFactory = new GeometryFactory()
    val pointObject = geometryFactory.createPoint(new Coordinate(-84.01, 34.01))
    val K = 1000 // K Nearest Neighbors


    val buildOnSpatialPartitionedRDD = false // Set to TRUE only if run join query
    objectRDD.buildIndex(IndexType.RTREE, buildOnSpatialPartitionedRDD)

    val usingIndex = true
    val result = KNNQuery.SpatialKnnQuery(objectRDD, pointObject, K, usingIndex)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    GeometryFactory geometryFactory = new GeometryFactory()
    Point pointObject = geometryFactory.createPoint(new Coordinate(-84.01, 34.01))
    val K = 1000 // K Nearest Neighbors


    boolean buildOnSpatialPartitionedRDD = false // Set to TRUE only if run join query
    objectRDD.buildIndex(IndexType.RTREE, buildOnSpatialPartitionedRDD)

    boolean usingIndex = true
    JavaRDD result = KNNQuery.SpatialKnnQuery(objectRDD, pointObject, K, usingIndex)
    ```
  </Tab>
</Tabs>

<Warning>
  Only R-Tree index supports Spatial KNN query
</Warning>

### Output format

<Tabs>
  <Tab title="Python">
    The output format of the spatial KNN query is a list of GeoData objects. The list has K GeoData objects.

    Example:

    ```python theme={"system"}
    >> result

    [GeoData, GeoData, GeoData, GeoData, GeoData]
    ```
  </Tab>

  <Tab title="Scala/Java">
    The output format of the spatial KNN query is a list of geometries. The list has K geometry objects.
  </Tab>
</Tabs>

## Write a Spatial Join Query

A spatial join query takes as input two Spatial RDD A and B. For each geometry in A, finds the geometries (from B) covered/intersected by it. A and B can be any geometry type and are not necessary to have the same geometry type.

Assume you now have two SpatialRDDs (typed or generic). You can use the following code to issue an Spatial Join Query on them.

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark import *

    consider_boundary_intersection = False ## Only return geometries fully covered by each query window in queryWindowRDD
    using_index = False

    object_rdd.analyze()

    object_rdd.spatialPartitioning(GridType.KDBTREE)
    query_window_rdd.spatialPartitioning(object_rdd.getPartitioner())

    result = JoinQuery.SpatialJoinQuery(object_rdd, query_window_rdd, using_index, consider_boundary_intersection)
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    val spatialPredicate = SpatialPredicate.COVERED_BY // Only return gemeotries fully covered by each query window in queryWindowRDD
    val usingIndex = false

    objectRDD.analyze()

    objectRDD.spatialPartitioning(GridType.KDBTREE)
    queryWindowRDD.spatialPartitioning(objectRDD.getPartitioner)

    val result = JoinQuery.SpatialJoinQuery(objectRDD, queryWindowRDD, usingIndex, spatialPredicate)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    SpatialPredicate spatialPredicate = SpatialPredicate.COVERED_BY // Only return gemeotries fully covered by each query window in queryWindowRDD
    val usingIndex = false

    objectRDD.analyze()

    objectRDD.spatialPartitioning(GridType.KDBTREE)
    queryWindowRDD.spatialPartitioning(objectRDD.getPartitioner)

    JavaPairRDD result = JoinQuery.SpatialJoinQuery(objectRDD, queryWindowRDD, usingIndex, spatialPredicate)
    ```
  </Tab>
</Tabs>

<Note>
  Spatial join query is equal to the following query in Spatial SQL:
</Note>

```sql theme={"system"}
SELECT superhero.name
FROM city, superhero
WHERE ST_Contains(city.geom, superhero.geom);
```

Find the superheroes in each city

### Use spatial partitioning

Sedona spatial partitioning method can significantly speed up the join query. Three spatial partitioning methods are available: KDB-Tree, Quad-Tree and R-Tree. Two SpatialRDD must be partitioned by the same way.

If you first partition SpatialRDD A, then you must use the partitioner of A to partition B.

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    object_rdd.spatialPartitioning(GridType.KDBTREE)
    query_window_rdd.spatialPartitioning(object_rdd.getPartitioner())
    ```
  </Tab>

  <Tab title="Scala/Java">
    ```scala theme={"system"}
    objectRDD.spatialPartitioning(GridType.KDBTREE)
    queryWindowRDD.spatialPartitioning(objectRDD.getPartitioner)
    ```
  </Tab>
</Tabs>

Or

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    query_window_rdd.spatialPartitioning(GridType.KDBTREE)
    object_rdd.spatialPartitioning(query_window_rdd.getPartitioner())
    ```
  </Tab>

  <Tab title="Scala/Java">
    ```scala theme={"system"}
    queryWindowRDD.spatialPartitioning(GridType.KDBTREE)
    objectRDD.spatialPartitioning(queryWindowRDD.getPartitioner)
    ```
  </Tab>
</Tabs>

### Use spatial indexes

To utilize a spatial index in a spatial join query, use the following code:

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark import *

    object_rdd.spatialPartitioning(GridType.KDBTREE)
    query_window_rdd.spatialPartitioning(object_rdd.getPartitioner())

    build_on_spatial_partitioned_rdd = True ## Set to TRUE only if run join query
    using_index = True
    query_window_rdd.buildIndex(IndexType.QUADTREE, build_on_spatial_partitioned_rdd)

    result = JoinQuery.SpatialJoinQueryFlat(object_rdd, query_window_rdd, using_index, True)
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    objectRDD.spatialPartitioning(joinQueryPartitioningType)
    queryWindowRDD.spatialPartitioning(objectRDD.getPartitioner)

    val buildOnSpatialPartitionedRDD = true // Set to TRUE only if run join query
    val usingIndex = true
    queryWindowRDD.buildIndex(IndexType.QUADTREE, buildOnSpatialPartitionedRDD)

    val result = JoinQuery.SpatialJoinQueryFlat(objectRDD, queryWindowRDD, usingIndex, spatialPredicate)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    objectRDD.spatialPartitioning(joinQueryPartitioningType)
    queryWindowRDD.spatialPartitioning(objectRDD.getPartitioner)

    boolean buildOnSpatialPartitionedRDD = true // Set to TRUE only if run join query
    boolean usingIndex = true
    queryWindowRDD.buildIndex(IndexType.QUADTREE, buildOnSpatialPartitionedRDD)

    JavaPairRDD result = JoinQuery.SpatialJoinQueryFlat(objectRDD, queryWindowRDD, usingIndex, spatialPredicate)
    ```
  </Tab>
</Tabs>

The index should be built on either one of two SpatialRDDs. In general, you should build it on the larger SpatialRDD.

### Output format

<Tabs>
  <Tab title="Python">
    Result for this query is RDD which holds two GeoData objects within list of lists.
    Example:

    ```python theme={"system"}
    result.collect()
    ```

    ```
    [[GeoData, GeoData], [GeoData, GeoData] ...]
    ```

    It is possible to do some RDD operation on result data ex. Getting polygon centroid.

    ```python theme={"system"}
    result.map(lambda x: x[0].geom.centroid).collect()
    ```
  </Tab>

  <Tab title="Scala/Java">
    The output format of the spatial join query is a PairRDD. In this PairRDD, each object is a pair of two geometries. The left one is the geometry from objectRDD and the right one is the geometry from the queryWindowRDD.

    ```
    Point,Polygon
    Point,Polygon
    Point,Polygon
    Polygon,Polygon
    LineString,LineString
    Polygon,LineString
    ...
    ```

    Each object on the left is covered/intersected by the object on the right.
  </Tab>
</Tabs>

Sedona Python users: Please use JoinQueryRaw from the same module for methods

* spatialJoin
* DistanceJoinQueryFlat
* SpatialJoinQueryFlat

For better performance while converting to dataframe with adapter. That approach allows to avoid costly serialization between Python and jvm and in result operating on python object instead of native geometries.

Example:

```python theme={"system"}
from sedona.spark import *

object_rdd.analyze()

circle_rdd = CircleRDD(object_rdd, 0.1) ## Create a CircleRDD using the given distance
circle_rdd.analyze()

circle_rdd.spatialPartitioning(GridType.KDBTREE)
spatial_rdd.spatialPartitioning(circle_rdd.getPartitioner())

consider_boundary_intersection = False ## Only return gemeotries fully covered by each query window in queryWindowRDD
using_index = False

result = JoinQueryRaw.DistanceJoinQueryFlat(spatial_rdd, circle_rdd, using_index, consider_boundary_intersection)

gdf = Adapter.toDf(result, ["left_col1", ..., "lefcoln"], ["rightcol1", ..., "rightcol2"], spark)
```

## Write a Distance Join Query

<Warning>
  RDD distance joins are only reliable for points. For other geometry types, please use Spatial SQL.
</Warning>

A distance join query takes as input two Spatial RDD A and B and a distance. For each geometry in A, finds the geometries (from B) are within the given distance to it. A and B can be any geometry type and are not necessary to have the same geometry type. You can transform your CRS using Spatial SQL `ST_Transform`.

If you don't want to transform your data and are ok with sacrificing the query accuracy, you can use an approximate degree value for distance. Please use [this calculator](https://lucidar.me/en/online-unit-converter-length-to-angle/convert-degrees-to-meters/#online-converter).

Assume you now have two SpatialRDDs (typed or generic). You can use the following code to issue an Distance Join Query on them.

<Tabs>
  <Tab title="Python">
    The output format of the distance join query is a list of lists. Each list contains two GeoData objects.
    Example:

    ```python theme={"system"}
    result.collect()
    ```

    ```
    [[GeoData, GeoData], [GeoData, GeoData] ...]
    ```

    It is possible to do some RDD operation on result data ex. Getting polygon centroid.

    ```python theme={"system"}
    result.map(lambda x: x[0].geom.centroid).collect()
    ```
  </Tab>

  <Tab title="Scala/Java">
    The output format of the distance join query is a PairRDD. In this PairRDD, each object is a pair of two geometries. The left one is the geometry from objectRDD and the right one is the geometry from the circleRDD.

    ```
    Point,Polygon
    Point,Polygon
    Point,Polygon
    Polygon,Polygon
    LineString,LineString
    Polygon,LineString
    ...
    ```

    Each object on the left is within the given distance to the object on the right.

    ```java theme={"system"}
    objectRddA.analyze()

    CircleRDD circleRDD = new CircleRDD(objectRddA, 0.1) // Create a CircleRDD using the given distance

    circleRDD.spatialPartitioning(GridType.KDBTREE)
    objectRddB.spatialPartitioning(circleRDD.getPartitioner)

    SpatialPredicate spatialPredicate = SpatialPredicate.COVERED_BY // Only return gemeotries fully covered by each query window in queryWindowRDD
    boolean usingIndex = false

    JavaPairRDD result = JoinQuery.DistanceJoinQueryFlat(objectRddB, circleRDD, usingIndex, spatialPredicate)
    ```
  </Tab>
</Tabs>

Distance join can only accept `COVERED_BY` and `INTERSECTS` as spatial predicates. The rest part of the join query is same as the spatial join query.

The details of spatial partitioning in join query is [here](#use-spatial-partitioning).

The details of using spatial indexes in join query is [here](#use-spatial-indexes-2).

The output format of the distance join query is [here](#output-format-2).

<Note>
  Distance join query is equal to the following query in Spatial SQL:
</Note>

```sql theme={"system"}
SELECT superhero.name
FROM city, superhero
WHERE ST_Distance(city.geom, superhero.geom) <= 10;
```

Find the superheroes within 10 miles of each city

## Convert SpatialRDD to Spatial DataFrame

Use WherobotsDB DataFrame-RDD Adapter to convert a DataFrame to an SpatialRDD.

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark import *

    spatialDf = Adapter.toDf(spatialRDD, sedona)
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    var spatialDf = Adapter.toDf(spatialRDD, sedona)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    Dataset<Row> spatialDf = Adapter.toDf(spatialRDD, sedona)
    ```
  </Tab>
</Tabs>

You may also manually specify a schema for the resulting DataFrame in case you require different column names or data
types. Note that string schemas and not all data types are supported\&mdash. At least one column for the user data must be provided.

<Tabs>
  <Tab title="Scala">
    ```scala theme={"system"}
    val schema = StructType(Array(
        StructField("county", GeometryUDT, nullable = true),
        StructField("name", StringType, nullable = true),
        StructField("price", DoubleType, nullable = true),
        StructField("age", IntegerType, nullable = true)
    ))
    val spatialDf = Adapter.toDf(spatialRDD, schema, sedona)
    ```
  </Tab>
</Tabs>

## Convert SpatialPairRDD to Spatial DataFrame

PairRDD is the result of a spatial join query or distance join query. Spatial SQL DataFrame-RDD Adapter can convert the result to a DataFrame. But you need to provide the name of other attributes.

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark import *

    joinResultDf = Adapter.toDf(jvm_sedona_rdd, ["poi_from_id", "poi_from_name"], ["poi_to_id", "poi_to_name"], spark))
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    var joinResultDf = Adapter.toDf(joinResultPairRDD, Seq("left_attribute1", "left_attribute2"), Seq("right_attribute1", "right_attribute2"), sedona)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    import scala.collection.JavaConverters;

    List leftFields = new ArrayList<>(Arrays.asList("c1", "c2", "c3"));
    List rightFields = new ArrayList<>(Arrays.asList("c4", "c5", "c6"));
    Dataset joinResultDf = Adapter.toDf(joinResultPairRDD, JavaConverters.asScalaBuffer(leftFields).toSeq(), JavaConverters.asScalaBuffer(rightFields).toSeq(), sedona);
    ```
  </Tab>
</Tabs>

or you can use the attribute names directly from the input RDD

<Tabs>
  <Tab title="Python">
    ```python theme={"system"}
    from sedona.spark import *

    joinResultDf = Adapter.toDf(result_pair_rdd, leftRdd.fieldNames, rightRdd.fieldNames, spark)
    ```
  </Tab>

  <Tab title="Scala">
    ```scala theme={"system"}
    import scala.collection.JavaConversions._
    var joinResultDf = Adapter.toDf(joinResultPairRDD, leftRdd.fieldNames, rightRdd.fieldNames, sedona)
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={"system"}
    import scala.collection.JavaConverters;
    Dataset joinResultDf = Adapter.toDf(joinResultPairRDD, JavaConverters.asScalaBuffer(leftRdd.fieldNames).toSeq(), JavaConverters.asScalaBuffer(rightRdd.fieldNames).toSeq(), sedona);
    ```
  </Tab>
</Tabs>

You may also manually specify a schema for the resulting DataFrame in case you require different column names or data
types. Note that string schemas and not all data types are supported\&mdash. Columns for the left and right user data must be provided.

<Tabs>
  <Tab title="Scala">
    ```scala theme={"system"}
    val schema = StructType(Array(
        StructField("leftGeometry", GeometryUDT, nullable = true),
        StructField("name", StringType, nullable = true),
        StructField("price", DoubleType, nullable = true),
        StructField("age", IntegerType, nullable = true),
        StructField("rightGeometry", GeometryUDT, nullable = true),
        StructField("category", StringType, nullable = true)
    ))
    val joinResultDf = Adapter.toDf(joinResultPairRDD, schema, sedona)
    ```
  </Tab>
</Tabs>
