59 changes: 59 additions & 0 deletions docs/sql-ref-geospatial-types.md
@@ -143,6 +143,65 @@ SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'0101000000000000000000F03F00000000000
* **Mixed-SRID columns** (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are allowed.
* **Storage**: Parquet, Delta, and Iceberg store geometry/geography with a fixed SRID per column; mixed-SRID types are for in-memory/query use. When writing to these formats, a concrete (fixed) SRID is required.

### Supported SRIDs

Spark includes a pre-built SRID registry that combines coordinate systems from the PROJ database with OGC standard overrides. This registry enables validation and proper handling of coordinate systems for geospatial data.

**SRID Compatibility Rules:**
* **GEOMETRY** accepts every SRID in the registry: geographic SRIDs, projected SRIDs, and SRID 0.
* **GEOGRAPHY** accepts only geographic SRIDs (latitude/longitude coordinate systems).
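
The two rules above can be modeled outside Spark with a small lookup. The following Python sketch is illustrative only and is not Spark's implementation: the `SRID_KIND` table and the `accepts_srid` helper are hypothetical names, and the table holds only the five SRIDs listed under "Commonly Used SRIDs" on this page rather than the full registry.

```python
# Illustrative model of the SRID compatibility rules; not Spark code.
# Assumption: only the five SRIDs documented on this page are included.
GEOGRAPHIC = "geographic"
PROJECTED = "projected"
CARTESIAN = "cartesian"

SRID_KIND = {
    0: CARTESIAN,      # unspecified CRS
    4326: GEOGRAPHIC,  # WGS 84
    4267: GEOGRAPHIC,  # NAD27
    4269: GEOGRAPHIC,  # NAD83
    3857: PROJECTED,   # Web Mercator
}


def accepts_srid(type_name: str, srid: int) -> bool:
    """Return True if the given type would accept the SRID under the stated rules."""
    kind = SRID_KIND.get(srid)
    if kind is None:
        # SRIDs missing from the registry are rejected by both types.
        return False
    name = type_name.upper()
    if name == "GEOMETRY":
        # GEOMETRY accepts every SRID that is in the registry.
        return True
    if name == "GEOGRAPHY":
        # GEOGRAPHY accepts only geographic SRIDs.
        return kind == GEOGRAPHIC
    raise ValueError(f"unknown geospatial type: {type_name}")


print(accepts_srid("GEOMETRY", 3857))   # projected SRID with GEOMETRY
print(accepts_srid("GEOGRAPHY", 3857))  # projected SRID with GEOGRAPHY
```

In this model, a projected SRID such as 3857 passes for `GEOMETRY` and fails for `GEOGRAPHY`, which mirrors the behavior described in the SRID Validation section.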

#### PROJ Version by Spark Release

| Spark Version | PROJ Version |
|---------------|--------------|
| 4.2.0 | 9.7.1 |

The SRID registry is pinned to the PROJ version shown above for each Spark release. It is not synchronized with external databases at runtime.

#### OGC Standard Overrides

Spark applies the following OGC standard overrides to specific SRIDs from the PROJ database:

| SRID | PROJ CRS Identifier | OGC CRS Identifier | Description |
|------|---------------------|-------------------|-------------|
| 4326 | `EPSG:4326` | `OGC:CRS84` | WGS 84 (longitude/latitude order per OGC standard) |
| 4267 | `EPSG:4267` | `OGC:CRS27` | NAD27 |
| 4269 | `EPSG:4269` | `OGC:CRS83` | NAD83 |
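
One way to picture the override table is as a small mapping consulted before the PROJ-derived identifier. The Python sketch below is a simplified model, not Spark's implementation: `OGC_OVERRIDES` and `crs_identifier` are hypothetical names, the helper does not validate that an SRID is in the registry, and the fallback assumes the SRID is an EPSG-authority code, which will not hold for every PROJ entry.

```python
# Illustrative model of the OGC overrides; not Spark code.
# The three entries come from the override table in this section.
OGC_OVERRIDES = {
    4326: "OGC:CRS84",
    4267: "OGC:CRS27",
    4269: "OGC:CRS83",
}


def crs_identifier(srid: int) -> str:
    """Return the CRS identifier for an SRID, applying OGC overrides first."""
    if srid == 0:
        # SRID 0 has no defined CRS and is shown as SRID:0 in this document.
        return "SRID:0"
    # Assumption: any SRID without an override is an EPSG code.
    return OGC_OVERRIDES.get(srid, f"EPSG:{srid}")


print(crs_identifier(4326))
print(crs_identifier(3857))
```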


#### Commonly Used SRIDs

| SRID | CRS Identifier | Name | CRS Type | Description |
|------|----------------|------|----------|-------------|
| 0 | `SRID:0` | Unspecified | Cartesian | Coordinates with no defined CRS (default for `ST_GeomFromWKB(wkb)`) |
| 4326 | `OGC:CRS84` | WGS 84 | Geographic | World Geodetic System 1984 (longitude/latitude); used for GPS coordinates and global data. Default for `GEOGRAPHY`. |
| 4267 | `OGC:CRS27` | NAD27 | Geographic | North American Datum 1927 |
| 4269 | `OGC:CRS83` | NAD83 | Geographic | North American Datum 1983 |
| 3857 | `EPSG:3857` | Web Mercator | Projected | Pseudo-Mercator projection used by web mapping services |

**Notes:**
* `GEOMETRY(0)` means a fixed SRID of 0. For mixed per-row SRIDs, use `GEOMETRY(ANY)`.
* The [Parquet](https://github.com/apache/parquet-format/blob/master/Geospatial.md)
  and [Iceberg](https://github.com/apache/iceberg/blob/main/format/spec.md) geospatial
  specifications require a fixed SRID per column, so `GEOMETRY(ANY)` and
  `GEOGRAPHY(ANY)` columns cannot be persisted in these formats.
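
The distinction between fixed and mixed SRID types in the notes above can be checked mechanically from the type string. The Python sketch below is a hypothetical helper for illustration and is not part of Spark: `has_fixed_srid` only parses the type syntax used on this page (`GEOMETRY(<srid>)`, `GEOGRAPHY(<srid>)`, or `ANY` in place of the SRID) and does not check the SRID against the registry.

```python
import re

# Matches GEOMETRY(<digits>), GEOGRAPHY(<digits>), or either with ANY.
_TYPE_PATTERN = re.compile(
    r"\s*(GEOMETRY|GEOGRAPHY)\s*\(\s*(ANY|\d+)\s*\)\s*", re.IGNORECASE
)


def has_fixed_srid(column_type: str) -> bool:
    """Return True for fixed-SRID types, False for mixed-SRID (ANY) types."""
    match = _TYPE_PATTERN.fullmatch(column_type)
    if match is None:
        raise ValueError(f"not a geospatial type: {column_type!r}")
    # GEOMETRY(0) is a fixed SRID of 0; only ANY denotes mixed SRIDs.
    return match.group(2).upper() != "ANY"


print(has_fixed_srid("GEOMETRY(0)"))
print(has_fixed_srid("GEOMETRY(ANY)"))
```

Under the rules stated in the notes, only types for which this check returns `True` would be candidates for persisting to Parquet or Iceberg.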

#### SRID Validation

**Invalid SRID (not in registry):**
```sql
SELECT ST_GeomFromWKB(X'0101000000000000000000F03F0000000000000040', 99999);
-- Throws [ST_INVALID_SRID_VALUE]
```

**Projected SRID with GEOGRAPHY type:**
```sql
CREATE TABLE invalid_geo (id BIGINT, loc GEOGRAPHY(3857));
-- Throws [ST_INVALID_SRID_VALUE] (3857 is projected, not geographic)
```


### Data Types Reference

For the full list of supported data types and API usage in Scala, Java, Python, and SQL, see [Data Types](sql-ref-datatypes.html).