The Weighting object provides methods for creating spatial weight matrices that define neighborhood relationships between spatial features. These weights are commonly used in spatial autocorrelation analysis, hotspot detection, and other spatial statistics operations.

addBinaryDistanceBandColumn()

Adds a weights column to a dataframe. For each record, the column lists the other records within the threshold distance together with their weights.
def addBinaryDistanceBandColumn(
  dataframe: DataFrame,
  threshold: Double,
  includeZeroDistanceNeighbors: Boolean = true,
  includeSelf: Boolean = false,
  geometry: String = null,
  useSpheroid: Boolean = false,
  savedAttributes: Seq[String] = null,
  resultName: String = "weights"
): DataFrame
Weights are always 1.0. The dataframe must contain at least one GeometryType column, and its rows must be unique. If exactly one geometry column is present, it is used automatically. If several are present, the one named ‘geometry’ is used; if none of them is named ‘geometry’, the column name must be provided.

Parameters

dataframe (DataFrame, required): DataFrame with a geometry column.
threshold (Double, required): Distance threshold within which records are considered neighbors.
includeZeroDistanceNeighbors (Boolean, default: true): Whether to include neighbors at zero distance. Weights in this method are always 1.0, so zero-distance neighbors do not produce infinite weights here.
includeSelf (Boolean, default: false): Whether to include each record in its own list of neighbors.
geometry (String, default: null): Name of the geometry column.
useSpheroid (Boolean, default: false): Whether to use a spheroidal distance calculation instead of a Cartesian one.
savedAttributes (Seq[String], default: null): The attributes to save in the neighbor column. Defaults to all columns.
resultName (String, default: "weights"): The name of the resulting column.

Returns

The input DataFrame with an added weights column that lists, for each row, its neighbors and their weights (always 1.0).
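
As a rough illustration of what a binary distance band computes, the sketch below reproduces the idea in plain Scala, without Spark or Sedona. The `Point` type, the `binaryBand` helper, and the inclusive threshold comparison are assumptions made for this example; they are not part of the Weighting API and do not describe its internals.

```scala
// Illustrative only: plain Scala, no Spark or Sedona involved.
object BinaryBandSketch {
  // Hypothetical record type: an id plus planar x/y coordinates.
  final case class Point(id: String, x: Double, y: Double)

  // Planar (Cartesian) distance between two points.
  def distance(a: Point, b: Point): Double =
    math.hypot(a.x - b.x, a.y - b.y)

  // For each point, list the ids of the points within `threshold`,
  // each paired with the constant weight 1.0.
  // Assumption: the threshold comparison is inclusive (d <= threshold).
  def binaryBand(
      points: Seq[Point],
      threshold: Double,
      includeZeroDistanceNeighbors: Boolean = true,
      includeSelf: Boolean = false
  ): Map[String, Seq[(String, Double)]] =
    points.map { p =>
      val neighbors = points.filter { q =>
        if (q.id == p.id) includeSelf
        else {
          val d = distance(p, q)
          d <= threshold && (includeZeroDistanceNeighbors || d > 0.0)
        }
      }
      p.id -> neighbors.map(q => (q.id, 1.0))
    }.toMap
}
```

With points a = (0, 0), b = (3, 4) and c = (100, 100) and a threshold of 5.0, a and b list each other with weight 1.0, while c has no neighbors.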

addDistanceBandColumn()

Adds a weights column to a dataframe. For each record, the column lists the other records within the threshold distance together with their weights.
def addDistanceBandColumn(
  dataframe: DataFrame,
  threshold: Double,
  binary: Boolean = true,
  alpha: Double = -1.0,
  includeZeroDistanceNeighbors: Boolean = false,
  includeSelf: Boolean = false,
  selfWeight: Double = 1.0,
  geometry: String = null,
  useSpheroid: Boolean = false,
  savedAttributes: Seq[String] = null,
  resultName: String = "weights"
): DataFrame
The dataframe must contain at least one GeometryType column, and its rows must be unique. If exactly one geometry column is present, it is used automatically. If several are present, the one named ‘geometry’ is used; if none of them is named ‘geometry’, the column name must be provided.

Parameters

dataframe (DataFrame, required): DataFrame with a geometry column.
threshold (Double, required): Distance threshold within which records are considered neighbors.
binary (Boolean, default: true): Whether to use binary weights (true) or inverse distance weights, dist^alpha (false).
alpha (Double, default: -1.0): Exponent to use for inverse distance weights. Ignored when binary is true.
includeZeroDistanceNeighbors (Boolean, default: false): Whether to include neighbors at zero distance. If they are included and binary is false, their weights are infinity for a negative alpha, per the floating-point spec (division by zero).
includeSelf (Boolean, default: false): Whether to include each record in its own list of neighbors.
selfWeight (Double, default: 1.0): The weight a record receives as its own neighbor.
geometry (String, default: null): Name of the geometry column.
useSpheroid (Boolean, default: false): Whether to use a spheroidal distance calculation instead of a Cartesian one.
savedAttributes (Seq[String], default: null): The attributes to save in the neighbor column. Defaults to all columns.
resultName (String, default: "weights"): The name of the resulting column.

Returns

The input DataFrame with an added weights column that lists, for each row, its neighbors and their weights.
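
The effect of the binary and alpha parameters on a single neighbor's weight can be sketched in plain Scala. The `neighborWeight` helper is hypothetical and only mirrors the rule described above (1.0 when binary, dist^alpha otherwise); it is not the library's implementation.

```scala
// Illustrative only: mirrors the documented weighting rule, not Sedona's code.
object DistanceBandWeightSketch {
  // Weight of a neighbor at distance `dist`, assumed to be within the threshold.
  // binary = true  -> constant weight 1.0 (alpha is ignored)
  // binary = false -> inverse distance weight dist^alpha
  def neighborWeight(
      dist: Double,
      binary: Boolean = true,
      alpha: Double = -1.0
  ): Double =
    if (binary) 1.0 else math.pow(dist, alpha)
}
```

For a neighbor at distance 2.0 this gives 1.0 with the defaults, 0.5 with binary = false, and 0.25 with binary = false and alpha = -2.0.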

addWeightedDistanceBandColumn()

Adds a weights column to a dataframe. For each record, the column lists the other records within the threshold distance together with their weights.
def addWeightedDistanceBandColumn(
  dataframe: DataFrame,
  threshold: Double,
  alpha: Double = -1.0,
  includeZeroDistanceNeighbors: Boolean = false,
  includeSelf: Boolean = false,
  selfWeight: Double = 1.0,
  geometry: String = null,
  useSpheroid: Boolean = false,
  savedAttributes: Seq[String] = null,
  resultName: String = "weights"
): DataFrame
Weights are dist^alpha. The dataframe must contain at least one GeometryType column, and its rows must be unique. If exactly one geometry column is present, it is used automatically. If several are present, the one named ‘geometry’ is used; if none of them is named ‘geometry’, the column name must be provided.

Parameters

dataframe (DataFrame, required): DataFrame with a geometry column.
threshold (Double, required): Distance threshold within which records are considered neighbors.
alpha (Double, default: -1.0): Exponent to use for inverse distance weights. The weight is computed as dist^alpha.
includeZeroDistanceNeighbors (Boolean, default: false): Whether to include neighbors at zero distance. If they are included, their weights are infinity for a negative alpha, per the floating-point spec (division by zero).
includeSelf (Boolean, default: false): Whether to include each record in its own list of neighbors.
selfWeight (Double, default: 1.0): The weight a record receives as its own neighbor.
geometry (String, default: null): Name of the geometry column.
useSpheroid (Boolean, default: false): Whether to use a spheroidal distance calculation instead of a Cartesian one.
savedAttributes (Seq[String], default: null): The attributes to save in the neighbor column. Defaults to all columns.
resultName (String, default: "weights"): The name of the resulting column.

Returns

The input DataFrame with an added weights column that lists, for each row, its neighbors and their weights (dist^alpha).
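
All three methods describe the same rule for choosing the geometry column. The plain-Scala sketch below restates that rule as a function; the `resolve` helper, its `Option`/`Either` signature, and the error messages are hypothetical and only model the documented behavior.

```scala
// Illustrative only: models the documented column-selection rule.
object GeometryColumnSketch {
  // geometryColumns: names of the GeometryType columns in the dataframe.
  // requested: the value passed as `geometry`, if any (None stands in for null).
  def resolve(
      geometryColumns: Seq[String],
      requested: Option[String] = None
  ): Either[String, String] =
    requested match {
      case Some(name) if geometryColumns.contains(name) => Right(name)
      case Some(name) => Left(s"'$name' is not a geometry column")
      case None =>
        geometryColumns match {
          case Seq()     => Left("no geometry column found")
          case Seq(only) => Right(only) // a single column is used automatically
          case many if many.contains("geometry") => Right("geometry")
          case _ => Left("several geometry columns; the column name must be provided")
        }
    }
}
```

Under this model, a dataframe with geometry columns `origin` and `destination` needs an explicit `geometry` argument, while one with `geometry` and `centroid` does not.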

Usage Examples

import org.apache.sedona.stats.Weighting

// spatialDf is assumed to be an existing DataFrame with a geometry column named "geometry"

// Binary distance band weighting (weights are always 1.0)
val binaryWeights = Weighting.addBinaryDistanceBandColumn(
  dataframe = spatialDf,
  threshold = 1000.0,
  includeZeroDistanceNeighbors = true,
  includeSelf = false,
  geometry = "geometry",
  useSpheroid = false,
  resultName = "binary_weights"
)

// Distance band with inverse distance weights (binary = false)
val distanceWeights = Weighting.addDistanceBandColumn(
  dataframe = spatialDf,
  threshold = 1000.0,
  binary = false,
  alpha = -1.0,
  includeZeroDistanceNeighbors = false,
  includeSelf = true,
  selfWeight = 1.0,
  geometry = "geometry",
  useSpheroid = false,
  resultName = "distance_weights"
)

// Weighted distance band with inverse distance weights
val weightedDistanceWeights = Weighting.addWeightedDistanceBandColumn(
  dataframe = spatialDf,
  threshold = 1000.0,
  alpha = -2.0,
  includeZeroDistanceNeighbors = false,
  includeSelf = true,
  selfWeight = 1.0,
  geometry = "geometry",
  useSpheroid = true,
  resultName = "weighted_weights"
)

// Using specific saved attributes
val limitedWeights = Weighting.addDistanceBandColumn(
  dataframe = spatialDf,
  threshold = 500.0,
  savedAttributes = Seq("id", "name", "value"),
  resultName = "neighbor_weights"
)

Weight Types

Binary Weights

Binary weights assign a value of 1.0 to every neighbor within the threshold distance. Records beyond the threshold are not listed as neighbors, which is equivalent to a weight of 0.0. This is the simplest form of spatial weighting.

Inverse Distance Weights

Inverse distance weights use the formula dist^alpha where alpha is typically negative (e.g., -1.0 or -2.0). Closer neighbors receive higher weights, and farther neighbors receive lower weights.
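
To make the decay concrete, the following plain-Scala sketch applies dist^alpha to a list of neighbor distances and shows what happens at zero distance. The `weights` helper is hypothetical; its `includeZeroDistanceNeighbors` flag only imitates the parameter of the same name and is not taken from the library's code.

```scala
// Illustrative only: inverse distance weights for a list of neighbor distances.
object InverseDistanceSketch {
  def weights(
      distances: Seq[Double],
      alpha: Double = -1.0,
      includeZeroDistanceNeighbors: Boolean = false
  ): Seq[Double] =
    distances
      // Drop zero-distance neighbors unless they are explicitly requested.
      .filter(d => includeZeroDistanceNeighbors || d > 0.0)
      // dist^alpha; for a negative alpha, a distance of 0.0 yields +Infinity.
      .map(d => math.pow(d, alpha))
}
```

For distances 1.0, 2.0 and 4.0, alpha = -1.0 gives 1.0, 0.5 and 0.25, while alpha = -2.0 gives 1.0, 0.25 and 0.0625, so a more negative alpha makes the weights fall off faster. A zero-distance neighbor that is kept receives an infinite weight.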

Distance Band Weights

Distance band weights are binary when the binary parameter is true (the default) and inverse distance based (dist^alpha) when it is false.

Notes

  • All methods require a DataFrame with at least one geometry column
  • Rows in the DataFrame must be unique
  • If multiple geometry columns exist and none is named ‘geometry’, the column name must be specified
  • The useSpheroid parameter determines whether to use Cartesian (planar) or spheroidal (great circle) distance calculations
  • Zero distance neighbors can cause infinite weights when using inverse distance weighting
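
The useSpheroid note matters mostly for longitude/latitude data, where a planar distance is expressed in degrees while a distance on the Earth's surface is expressed in meters. The sketch below contrasts the two in plain Scala. It uses the haversine formula on a sphere with a mean Earth radius, which is an assumption of this example and only an approximation of a spheroidal calculation; it is not necessarily the formula Sedona applies, and the unit expected for the threshold should be checked against the Sedona documentation.

```scala
// Illustrative only: planar vs. great-circle distance for lon/lat input.
object DistanceModeSketch {
  // Mean Earth radius in meters (an assumption of this sketch).
  val EarthRadiusMeters: Double = 6371008.8

  // Planar distance in the units of the coordinates (degrees for lon/lat).
  def planar(lon1: Double, lat1: Double, lon2: Double, lat2: Double): Double =
    math.hypot(lon2 - lon1, lat2 - lat1)

  // Great-circle distance in meters via the haversine formula.
  def haversine(lon1: Double, lat1: Double, lon2: Double, lat2: Double): Double = {
    val dLat = math.toRadians(lat2 - lat1)
    val dLon = math.toRadians(lon2 - lon1)
    val a = math.pow(math.sin(dLat / 2), 2) +
      math.cos(math.toRadians(lat1)) * math.cos(math.toRadians(lat2)) *
        math.pow(math.sin(dLon / 2), 2)
    2 * EarthRadiusMeters * math.asin(math.sqrt(a))
  }
}
```

One degree of longitude along the equator is 1.0 in planar terms but roughly 111 km on the sphere, so a threshold that suits one mode is unlikely to suit the other.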