Lateral joins in Spark

This short article explains lateral joins in Spark. I demonstrate them with a simple example in Scala and show two alternative approaches: an inner join combined with window functions, and pure Spark SQL. The Dataset method def lateralJoin(right: Dataset[_]): DataFrame was introduced in version 4.0.0, although according to the documentation, the lateral subquery feature itself was first introduced in version 3.2.0. The source code is available at https://github.com/JurajBurian/spark-playground. ...

October 20, 2025 · 5 min · 874 words · Me