[GH-2881] Add ST_Box2D(geom) scalar function #2890

Open
jiayuasu wants to merge 3 commits into apache:master from jiayuasu:feature/st-box2d-scalar

Member

jiayuasu commented May 2, 2026

Did you read the Contributor Guide?

Is this PR related to a ticket?

What changes were proposed in this PR?

Adds the `ST_Box2D(geom) -> Box2D` scalar function. Returns the planar bounding box of the input geometry as a `Box2D`, or SQL NULL for null/empty input (PostGIS-compatible).

This is the first user-facing function on top of the `Box2D` type and UDT introduced in #2878.
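
The null/empty rule described above can be sketched without any Sedona or JTS dependency. The block below is only an illustration under stated assumptions: `BoundingBoxSketch`, its nested `Box` class, and the list-of-coordinates input are hypothetical stand-ins, not the `Box2D` type or the `Box2D.fromGeometry` implementation from the PR.

```java
import java.util.List;

// Stdlib-only stand-in; all names here are hypothetical, not Sedona classes.
final class BoundingBoxSketch {

    /** Hypothetical value type carrying the xmin/ymin/xmax/ymax fields named in the PR. */
    static final class Box {
        final double xmin, ymin, xmax, ymax;

        Box(double xmin, double ymin, double xmax, double ymax) {
            this.xmin = xmin;
            this.ymin = ymin;
            this.xmax = xmax;
            this.ymax = ymax;
        }
    }

    /**
     * Returns the planar bounding box of the given (x, y) coordinates,
     * or null when the input is null or has no coordinates
     * (the analogue of SQL NULL for null/empty geometries).
     */
    static Box box2D(List<double[]> coords) {
        if (coords == null || coords.isEmpty()) {
            return null;
        }
        double xmin = Double.POSITIVE_INFINITY;
        double ymin = Double.POSITIVE_INFINITY;
        double xmax = Double.NEGATIVE_INFINITY;
        double ymax = Double.NEGATIVE_INFINITY;
        for (double[] c : coords) {
            xmin = Math.min(xmin, c[0]);
            ymin = Math.min(ymin, c[1]);
            xmax = Math.max(xmax, c[0]);
            ymax = Math.max(ymax, c[1]);
        }
        return new Box(xmin, ymin, xmax, ymax);
    }
}
```

Returning `null` rather than a degenerate box keeps empty input distinguishable from a valid zero-area box such as a single point.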

  • `common/.../Functions.java` — adds `box2D(Geometry)` static helper that delegates to `Box2D.fromGeometry` (already merged).
  • `spark/common/.../expressions/InferredExpression.scala` — registers `Box2D` as an inferrable return type. Adds the `InferrableType[Box2D]` instance, the struct-row serializer (via `Box2DUDT.serialize`), and the Spark return-type mapping (`Box2D → Box2DUDT`). Future Box2D-returning functions plug in for free.
  • `spark/common/.../expressions/Functions.scala` — `ST_Box2D` case class extending `InferredExpression(Functions.box2D _)`.
  • `spark/common/.../UDF/Catalog.scala` — registers in the `boundingBoxExprs` group alongside `ST_Envelope`.
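
The idea behind registering a return type once, so that later functions returning it need no extra wiring, can be illustrated with a small type-to-serializer registry. This is a loose analogy only: `ReturnTypeRegistrySketch`, `Box`, `register`, and `serialize` are hypothetical names, and the `double[]` output merely stands in for the struct row produced by `Box2DUDT.serialize`. It does not reproduce how `InferredExpression` is actually implemented.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical analogy for a return-type registry; not Sedona's InferredExpression.
final class ReturnTypeRegistrySketch {

    /** Hypothetical stand-in for a Box2D-like result type. */
    static final class Box {
        final double xmin, ymin, xmax, ymax;

        Box(double xmin, double ymin, double xmax, double ymax) {
            this.xmin = xmin;
            this.ymin = ymin;
            this.xmax = xmax;
            this.ymax = ymax;
        }
    }

    // Maps a function's return type to the serializer used for its results.
    private static final Map<Class<?>, Function<Object, Object>> SERIALIZERS = new HashMap<>();

    /** Registers a serializer for one return type. */
    static <T> void register(Class<T> type, Function<T, Object> serializer) {
        SERIALIZERS.put(type, value -> serializer.apply(type.cast(value)));
    }

    static {
        // Registered once; any function returning Box is then serializable.
        register(Box.class, b -> new double[] {b.xmin, b.ymin, b.xmax, b.ymax});
    }

    /** Serializes a result whose type was registered; null passes through as null. */
    static Object serialize(Object result) {
        if (result == null) {
            return null;
        }
        Function<Object, Object> serializer = SERIALIZERS.get(result.getClass());
        if (serializer == null) {
            throw new IllegalArgumentException("No serializer registered for " + result.getClass());
        }
        return serializer.apply(result);
    }
}
```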

How was this patch tested?

`functionTestScala` (added "Passed ST_Box2D"):

  • Polygon input → expected xmin/ymin/xmax/ymax via `row.getAsBox2D`.
  • `POINT EMPTY` input → NULL.
  • `CAST(NULL AS GEOMETRY)` input → NULL.

Did this PR include necessary documentation updates?

Returns the planar bounding box of a Geometry as a Box2D, or SQL NULL
for null/empty input. Adds the Box2D type plumbing to
InferredExpression (InferrableType[Box2D] instance, struct serializer
via Box2DUDT, Spark return-type mapping) so future ST_* functions
returning Box2D can plug in by reusing the same machinery.

Closes apache#2881.
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces the first Spark SQL scalar function that returns the new Box2D type: ST_Box2D(geom). It wires the function through Sedona’s common geometry helpers, Spark expression/type inference, and SQL function catalog so planar geometry bounding boxes can be returned as a first-class Box2D value.

Changes:

  • Added Functions.box2D(Geometry) in the common module and exposed it as a new Spark SQL expression ST_Box2D.
  • Extended inferred Spark return-type handling so Box2D results serialize through Box2DUDT.
  • Registered the function in the SQL catalog and added basic SQL-level tests for normal, empty, and null inputs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `common/src/main/java/org/apache/sedona/common/Functions.java` | Adds the common helper that computes a Box2D from a geometry. |
| `spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/InferredExpression.scala` | Adds inferred-type support and serialization for Box2D return values. |
| `spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/Functions.scala` | Defines the new ST_Box2D Spark expression. |
| `spark/common/src/main/scala/org/apache/sedona/sql/UDF/Catalog.scala` | Registers ST_Box2D in the SQL function catalog. |
| `spark/common/src/test/scala/org/apache/sedona/sql/functionTestScala.scala` | Adds SQL-level coverage for ST_Box2D result and null/empty behavior. |

// Bounding-Box-Functions
val boundingBoxExprs: Seq[FunctionDescription] = Seq(
  function[ST_BoundingDiagonal](),
  function[ST_Box2D](),
Member Author

Scala DataFrame API wrappers (st_functions.*) for the Box2D Phase 1 surface are deliberately deferred and tracked in #2891. Doing them in one coherent batch after the SQL functions land reads cleaner than wrapping each in isolation.

Comment on lines +272 to +277
} else if (t =:= typeOf[Box2D]) { output =>
  if (output != null) {
    Box2DUDT().serialize(output.asInstanceOf[Box2D])
  } else {
    null
  }
Member Author

Done in 8674e2f. Replaced Box2DUDT() (which calls apply() and allocates) with the case object Box2DUDT (singleton), captured once outside the closure.

Box2DUDT() (with parens) calls apply() which constructs a fresh
Box2DUDT for every row. The case object Box2DUDT is already the
singleton — capturing it once outside the closure removes the
per-row allocation in the hot path.
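
The difference between constructing a serializer inside the per-row closure and capturing one instance outside it can be made observable with a construction counter. The sketch below is a Java analogue under assumptions: `SingletonCaptureSketch`, `Serializer`, `perRow`, and `capturedOnce` are hypothetical names, and the Scala `apply()` versus case-object distinction from the PR is only mimicked here by `new Serializer()` versus a captured local.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical Java analogue of the per-row allocation fix; not Sedona code.
final class SingletonCaptureSketch {

    /** Hypothetical serializer that counts how many times it is constructed. */
    static final class Serializer {
        static final AtomicInteger CONSTRUCTED = new AtomicInteger();

        Serializer() {
            CONSTRUCTED.incrementAndGet();
        }

        double[] serialize(double[] box) {
            return box.clone();
        }
    }

    /** Before: a new Serializer is built every time the closure handles a non-null row. */
    static Function<double[], double[]> perRow() {
        return box -> box == null ? null : new Serializer().serialize(box);
    }

    /** After: one Serializer is built up front and reused by the closure for every row. */
    static Function<double[], double[]> capturedOnce() {
        final Serializer shared = new Serializer();
        return box -> box == null ? null : shared.serialize(box);
    }
}
```

Applying `perRow()` to five rows constructs five serializers, while `capturedOnce()` constructs exactly one regardless of row count.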
… test

Spark 4.x's parser rejects CAST(NULL AS GEOMETRY) because GEOMETRY is
a UDT, not a primitive data type the parser recognizes. Pass NULL
through ST_GeomFromText instead — exercises the same null-propagation
path.

Development

Successfully merging this pull request may close these issues.

Implement ST_Box2D(geom) scalar function

2 participants