# Weighting Python Module

> Python API for spatial weighting operations.

The weighting module provides functions for creating spatial weight matrices that define neighborhood relationships between spatial features. These weights are commonly used in spatial autocorrelation analysis, hotspot detection, and other spatial statistics operations.

## `add_binary_distance_band_column()`

Annotates a dataframe with a weights column containing the other records within the threshold and their weight.

```python theme={"system"}
def add_binary_distance_band_column(
    dataframe: DataFrame,
    threshold: float,
    include_zero_distance_neighbors: bool = True,
    include_self: bool = False,
    geometry: Optional[str] = None,
    use_spheroid: bool = False,
    saved_attributes: Optional[List[str]] = None,
    result_name: str = 'weights'
) -> DataFrame
```

Weights are always 1.0. The dataframe must contain at least one GeometryType column, and rows must be unique. If exactly one geometry column is present, it is used automatically. If several are present, the one named 'geometry' is used. If several are present and none is named 'geometry', the column name must be provided.
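The column-selection rule above can be sketched in plain Python. This is only an illustration of the documented behavior: `resolve_geometry_column` is a hypothetical helper and is not part of the weighting module, and the error type is an assumption.

```python
from typing import List, Optional


def resolve_geometry_column(
    geometry_columns: List[str], geometry: Optional[str] = None
) -> str:
    """Hypothetical helper: pick the geometry column as the rule above describes."""
    if geometry is not None:
        # An explicitly provided name always wins.
        if geometry not in geometry_columns:
            raise ValueError(f"{geometry!r} is not a geometry column")
        return geometry
    if not geometry_columns:
        raise ValueError("the dataframe needs at least one geometry column")
    if len(geometry_columns) == 1:
        # A single geometry column is used automatically.
        return geometry_columns[0]
    if "geometry" in geometry_columns:
        # Several candidates: prefer the one named 'geometry'.
        return "geometry"
    raise ValueError("several geometry columns found; pass the column name")


print(resolve_geometry_column(["geom"]))               # geom
print(resolve_geometry_column(["start", "geometry"]))  # geometry
```

Passing `geometry="..."` explicitly, as the usage examples on this page do, avoids relying on this resolution order.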

### Parameters

<ParamField path="dataframe" type="DataFrame" required>
  DataFrame with geometry column
</ParamField>

<ParamField path="threshold" type="float" required>
  Distance threshold for considering neighbors
</ParamField>

<ParamField path="include_zero_distance_neighbors" type="bool" default="True">
  Whether to include neighbors that are at distance 0. Weights from this function are always 1.0, so zero-distance neighbors do not produce infinite weights
</ParamField>

<ParamField path="include_self" type="bool" default="False">
  Whether to include self in the list of neighbors
</ParamField>

<ParamField path="geometry" type="Optional[str]" default="None">
  Name of the geometry column
</ParamField>

<ParamField path="use_spheroid" type="bool" default="False">
  Whether to use a spheroidal distance calculation (`True`) instead of a Cartesian one (`False`, the default)
</ParamField>

<ParamField path="saved_attributes" type="Optional[List[str]]" default="None">
  The attributes to save in the neighbor column. When `None` (the default), all columns are saved
</ParamField>

<ParamField path="result_name" type="str" default="'weights'">
  The name of the resulting column. Default is 'weights'
</ParamField>

### Returns

<ResponseField path="DataFrame" type="DataFrame">
  The input DataFrame with a weights column added. For each row, the column lists its neighbors and their weights (always 1.0)
</ResponseField>
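To make the semantics concrete, here is a minimal pure-Python sketch of a binary distance band over a handful of planar points. It does not use Spark or Sedona. The function name, the `(neighbor_id, weight)` tuple layout, and the inclusive `<=` threshold comparison are assumptions made for illustration, not the library's implementation or output schema.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def binary_distance_band(
    points: Dict[str, Point],
    threshold: float,
    include_zero_distance_neighbors: bool = True,
    include_self: bool = False,
) -> Dict[str, List[Tuple[str, float]]]:
    """List (neighbor_id, 1.0) for every record within the threshold."""
    weights: Dict[str, List[Tuple[str, float]]] = {}
    for key, origin in points.items():
        neighbors = []
        for other_key, other in points.items():
            if other_key == key:
                # A record is its own neighbor only when include_self is set.
                if include_self:
                    neighbors.append((other_key, 1.0))
                continue
            distance = math.dist(origin, other)
            if distance == 0.0 and not include_zero_distance_neighbors:
                continue
            if distance <= threshold:  # assumption: threshold is inclusive
                neighbors.append((other_key, 1.0))
        weights[key] = neighbors
    return weights


points = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (0.0, 0.0), "d": (50.0, 50.0)}
weights = binary_distance_band(points, threshold=10.0)
print(weights["a"])  # [('b', 1.0), ('c', 1.0)]
print(weights["d"])  # []
```

Record `c` shares coordinates with `a`, so it appears only because `include_zero_distance_neighbors` defaults to `True` here, matching the default documented for this function.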

## `add_distance_band_column()`

Annotates a dataframe with a weights column containing the other records within the threshold and their weight.

```python theme={"system"}
def add_distance_band_column(
    dataframe: DataFrame,
    threshold: float,
    binary: bool = True,
    alpha: float = -1.0,
    include_zero_distance_neighbors: bool = False,
    include_self: bool = False,
    self_weight: float = 1.0,
    geometry: Optional[str] = None,
    use_spheroid: bool = False,
    saved_attributes: Optional[List[str]] = None,
    result_name: str = 'weights'
) -> DataFrame
```

The dataframe must contain at least one GeometryType column, and rows must be unique. If exactly one geometry column is present, it is used automatically. If several are present, the one named 'geometry' is used. If several are present and none is named 'geometry', the column name must be provided.

### Parameters

<ParamField path="dataframe" type="DataFrame" required>
  DataFrame with geometry column
</ParamField>

<ParamField path="threshold" type="float" required>
  Distance threshold for considering neighbors
</ParamField>

<ParamField path="binary" type="bool" default="True">
  Whether to use binary weights (`True`) or inverse distance weights, computed as dist^alpha (`False`), for neighbors
</ParamField>

<ParamField path="alpha" type="float" default="-1.0">
  Alpha to use for inverse distance weights. Ignored when `binary` is true
</ParamField>

<ParamField path="include_zero_distance_neighbors" type="bool" default="False">
  Whether to include neighbors that are at distance 0. If they are included and `binary` is false, their weights are infinity as per the floating point spec (division by 0)
</ParamField>

<ParamField path="include_self" type="bool" default="False">
  Whether to include self in the list of neighbors
</ParamField>

<ParamField path="self_weight" type="float" default="1.0">
  The value to use for the self weight
</ParamField>

<ParamField path="geometry" type="Optional[str]" default="None">
  Name of the geometry column
</ParamField>

<ParamField path="use_spheroid" type="bool" default="False">
  Whether to use a spheroidal distance calculation (`True`) instead of a Cartesian one (`False`, the default)
</ParamField>

<ParamField path="saved_attributes" type="Optional[List[str]]" default="None">
  The attributes to save in the neighbor column. When `None` (the default), all columns are saved
</ParamField>

<ParamField path="result_name" type="str" default="'weights'">
  The name of the resulting column. Default is 'weights'
</ParamField>

### Returns

<ResponseField path="DataFrame" type="DataFrame">
  The input DataFrame with a weights column added. For each row, the column lists its neighbors and their weights
</ResponseField>
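How `binary`, `alpha`, and `self_weight` interact can be sketched as a per-neighbor weight function. `neighbor_weight` is a hypothetical helper, not part of the module, and the assumption that a record listed as its own neighbor receives `self_weight` is inferred from the parameter descriptions above.

```python
import math


def neighbor_weight(
    distance: float,
    is_self: bool = False,
    binary: bool = True,
    alpha: float = -1.0,
    self_weight: float = 1.0,
) -> float:
    """Hypothetical per-neighbor weight mirroring the documented parameters."""
    if is_self:
        # Assumption: the record itself gets self_weight when include_self is used.
        return self_weight
    if binary:
        # alpha is ignored for binary weights.
        return 1.0
    if distance == 0.0 and alpha < 0:
        # Plain Python raises ZeroDivisionError for 0.0 ** negative, so the
        # infinity described in the parameter docs is returned explicitly.
        return math.inf
    return distance ** alpha


print(neighbor_weight(4.0))                # 1.0
print(neighbor_weight(4.0, binary=False))  # 0.25
```

The explicit zero-distance branch exists only because this sketch runs in plain Python; the documentation above attributes the infinity to floating point division by zero.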

## `add_weighted_distance_band_column()`

Annotates a dataframe with a weights column containing the other records within the threshold and their weight.

```python theme={"system"}
def add_weighted_distance_band_column(
    dataframe: DataFrame,
    threshold: float,
    alpha: float,
    include_zero_distance_neighbors: bool = True,
    include_self: bool = False,
    self_weight: float = 1.0,
    geometry: Optional[str] = None,
    use_spheroid: bool = False,
    saved_attributes: Optional[List[str]] = None,
    result_name: str = 'weights'
) -> DataFrame
```

Weights are computed as distance^alpha. The dataframe must contain at least one GeometryType column, and rows must be unique. If exactly one geometry column is present, it is used automatically. If several are present, the one named 'geometry' is used. If several are present and none is named 'geometry', the column name must be provided.

### Parameters

<ParamField path="dataframe" type="DataFrame" required>
  DataFrame with geometry column
</ParamField>

<ParamField path="threshold" type="float" required>
  Distance threshold for considering neighbors
</ParamField>

<ParamField path="alpha" type="float" required>
  Alpha to use for inverse distance weights. Computation is dist^alpha. This parameter has no default in this function and must be provided
</ParamField>

<ParamField path="include_zero_distance_neighbors" type="bool" default="True">
  Whether to include neighbors that are at distance 0. With a negative `alpha`, included zero-distance neighbors get a weight of infinity as per the floating point spec (division by 0)
</ParamField>

<ParamField path="include_self" type="bool" default="False">
  Whether to include self in the list of neighbors
</ParamField>

<ParamField path="self_weight" type="float" default="1.0">
  The value to use for the self weight. Default is 1.0
</ParamField>

<ParamField path="geometry" type="Optional[str]" default="None">
  Name of the geometry column
</ParamField>

<ParamField path="use_spheroid" type="bool" default="False">
  Whether to use a spheroidal distance calculation (`True`) instead of a Cartesian one (`False`, the default)
</ParamField>

<ParamField path="saved_attributes" type="Optional[List[str]]" default="None">
  The attributes to save in the neighbor column. When `None` (the default), all columns are saved
</ParamField>

<ParamField path="result_name" type="str" default="'weights'">
  The name of the resulting column. Default is 'weights'
</ParamField>

### Returns

<ResponseField path="DataFrame" type="DataFrame">
  The input DataFrame with a weights column added. For each row, the column lists its neighbors and their weights (dist^alpha)
</ResponseField>

## Usage Examples

```python theme={"system"}
from sedona.spark.stats.weighting import (
    add_binary_distance_band_column,
    add_distance_band_column,
    add_weighted_distance_band_column
)

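# `spatial_df` is assumed to be an existing DataFrame with unique rows,
# a geometry column named "geometry", and the columns "id", "name", and "value".

# Binary distance band weighting (weights are always 1.0)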
binary_weights_df = add_binary_distance_band_column(
    dataframe=spatial_df,
    threshold=1000.0,
    include_zero_distance_neighbors=True,
    include_self=False,
    geometry="geometry",
    use_spheroid=False,
    result_name="binary_weights"
)

# Distance band with inverse distance weights (binary=False)
distance_weights_df = add_distance_band_column(
    dataframe=spatial_df,
    threshold=1000.0,
    binary=False,
    alpha=-1.0,
    include_zero_distance_neighbors=False,
    include_self=True,
    self_weight=1.0,
    geometry="geometry",
    use_spheroid=False,
    result_name="distance_weights"
)

# Weighted distance band with inverse distance weights
weighted_distance_df = add_weighted_distance_band_column(
    dataframe=spatial_df,
    threshold=1000.0,
    alpha=-2.0,
    include_zero_distance_neighbors=False,
    include_self=True,
    self_weight=1.0,
    geometry="geometry",
    use_spheroid=True,
    result_name="weighted_weights"
)

# Using specific saved attributes
limited_weights_df = add_distance_band_column(
    dataframe=spatial_df,
    threshold=500.0,
    saved_attributes=["id", "name", "value"],
    result_name="neighbor_weights"
)
```

## Weight Types

### Binary Weights

Binary weights assign 1.0 to every neighbor within the threshold distance. Records beyond the threshold are left out of the weights column, which is equivalent to a weight of 0.0. This is the simplest form of spatial weighting and is created using `add_binary_distance_band_column()`.

### Inverse Distance Weights

Inverse distance weights use the formula `dist^alpha`, where `alpha` is typically negative (e.g., -1.0 or -2.0). Closer neighbors receive higher weights, and farther neighbors receive lower weights. Use `add_weighted_distance_band_column()` for this type, or `add_distance_band_column()` with `binary=False`.
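As a quick numeric illustration of how `alpha` controls the decay, the snippet below evaluates `dist^alpha` for a few sample distances. The distances and the helper name `inverse_distance_weights` are made up for this example.

```python
from typing import List


def inverse_distance_weights(distances: List[float], alpha: float) -> List[float]:
    """Evaluate dist ** alpha for each (non-zero) distance."""
    return [d ** alpha for d in distances]


distances = [1.0, 2.0, 4.0, 8.0]
weights_alpha_1 = inverse_distance_weights(distances, -1.0)
weights_alpha_2 = inverse_distance_weights(distances, -2.0)

print(weights_alpha_1)  # [1.0, 0.5, 0.25, 0.125]
print(weights_alpha_2)  # [1.0, 0.25, 0.0625, 0.015625]
```

With `alpha=-2.0` the weight at distance 8 is one eighth of its `alpha=-1.0` value, so a more negative `alpha` concentrates influence on the closest neighbors.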

### Flexible Distance Band Weights

The `add_distance_band_column()` function provides flexibility to choose between binary or inverse distance weighting based on the `binary` parameter, making it the most versatile option.

## Notes

* All functions require a DataFrame with at least one geometry column
* Rows in the DataFrame must be unique
* If multiple geometry columns exist and none is named 'geometry', the column name must be specified
* The `use_spheroid` parameter selects the distance calculation: Cartesian (planar) when `False`, spheroidal (great circle) when `True`
* Zero distance neighbors can cause infinite weights when using inverse distance weighting (`binary=False`)
* The `saved_attributes` parameter allows you to control which columns are preserved in the neighbor information
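The practical effect of the distance calculation is easiest to see with longitude/latitude coordinates. The sketch below compares a planar distance with a haversine distance on a sphere, used here as a simple stand-in for a spheroidal calculation. It is not the library's implementation, and the assumption that a spheroidal distance is expressed in meters while a planar one is expressed in coordinate units should be checked against the Sedona documentation before choosing a `threshold`.

```python
import math
from typing import Tuple

LonLat = Tuple[float, float]


def planar_distance(a: LonLat, b: LonLat) -> float:
    """Cartesian distance in the units of the coordinates (degrees here)."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def haversine_distance_m(a: LonLat, b: LonLat, radius_m: float = 6_371_008.8) -> float:
    """Great-circle distance in meters on a sphere (stand-in for a spheroid)."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * radius_m * math.asin(math.sqrt(h))


equator_a, equator_b = (0.0, 0.0), (1.0, 0.0)  # one degree of longitude at the equator
north_a, north_b = (0.0, 60.0), (1.0, 60.0)    # one degree of longitude at 60 degrees north

print(planar_distance(equator_a, equator_b), planar_distance(north_a, north_b))
print(haversine_distance_m(equator_a, equator_b), haversine_distance_m(north_a, north_b))
```

Both pairs are exactly 1.0 apart in planar terms, yet the pair at 60 degrees north is roughly half as far apart on the ground, which is why the same `threshold` can select very different neighbors under the two settings.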
