PySpark 4 (and delta-spark 4) is now the baseline for spark-fuse, including automatic Delta Scala
classifier detection with an override via SPARK_FUSE_DELTA_SCALA_SUFFIX.
split_by_date_formats adopts try_to_timestamp where available, eliminating ANSI errors on invalid
rows while still surfacing unmatched records through the existing modes.
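The behavior described above can be sketched in plain Python. This is not spark-fuse's actual implementation; the helper names, the `strptime` patterns standing in for Spark datetime patterns, and the two-way split are all illustrative. The key point it mirrors is that `try_to_timestamp` returns null for an unparsable value instead of raising, so multiple formats can be attempted and leftovers collected:

```python
from datetime import datetime

def try_to_timestamp(value: str, fmt: str):
    # Mimics Spark's try_to_timestamp semantics: return None instead of
    # raising when the value does not match the format. (Python strptime
    # patterns stand in for Spark's datetime patterns here.)
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None

def split_by_formats(rows, formats):
    # Illustrative stand-in for split_by_date_formats: route each row to
    # the first format that parses it; collect the rest as unmatched so
    # they can be surfaced per the chosen mode.
    matched, unmatched = [], []
    for raw in rows:
        parsed = next(
            (ts for fmt in formats if (ts := try_to_timestamp(raw, fmt)) is not None),
            None,
        )
        (matched if parsed is not None else unmatched).append(raw)
    return matched, unmatched

matched, unmatched = split_by_formats(
    ["2024-05-01", "01/05/2024", "not-a-date"],
    ["%Y-%m-%d", "%d/%m/%Y"],
)
# matched   -> ["2024-05-01", "01/05/2024"]
# unmatched -> ["not-a-date"]
```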
Legacy catalog helpers (Unity/Hive), their CLI commands, and the experimental Qdrant connector have
been removed to keep the 1.0.0 surface area focused on the actively maintained toolchain.
Installation and demo docs now reflect the PySpark 4 requirements and updated Java guidance.
Ensure your environments (local notebooks, CI, clusters) install PySpark 4.x and delta-spark 4.x.
Earlier versions are no longer supported, and the package metadata enforces these minimums.
When customizing Delta Lake coordinates, set SPARK_FUSE_DELTA_SCALA_SUFFIX (for example, _2.12)
to override the auto-detected Scala binary suffix.
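As a rough illustration of how such an override could take precedence over auto-detection, here is a minimal sketch. The helper name and the `_2.13` default are assumptions for illustration (PySpark 4 builds ship against Scala 2.13), not spark-fuse's actual internals; only the `SPARK_FUSE_DELTA_SCALA_SUFFIX` variable comes from the release notes:

```python
import os

# Assumption for illustration: PySpark 4 is built against Scala 2.13.
DEFAULT_SCALA_SUFFIX = "_2.13"

def delta_maven_coordinate(delta_version: str = "4.0.0") -> str:
    # Hypothetical helper: the environment variable, when set, overrides
    # the auto-detected Scala binary suffix.
    suffix = os.environ.get("SPARK_FUSE_DELTA_SCALA_SUFFIX", DEFAULT_SCALA_SUFFIX)
    return f"io.delta:delta-spark{suffix}:{delta_version}"

print(delta_maven_coordinate())  # io.delta:delta-spark_2.13:4.0.0
os.environ["SPARK_FUSE_DELTA_SCALA_SUFFIX"] = "_2.12"
print(delta_maven_coordinate())  # io.delta:delta-spark_2.12:4.0.0
```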
If you depended on the removed catalog helpers or Qdrant stub, pin to spark-fuse==0.4.0 until you
migrate those workflows to first-party alternatives.
Require PySpark 4.x (and delta-spark 4.x) in the package metadata, and auto-detect the Scala binary
suffix when configuring Delta Lake jars, with an escape hatch via SPARK_FUSE_DELTA_SCALA_SUFFIX.
split_by_date_formats now relies on try_to_timestamp when available so PySpark 4 no longer raises
ANSI parsing errors for invalid rows; unmatched rows are still surfaced per the chosen mode.